Logistic Regression Analysis of LC-MS/MS Data of Monomers Eluted from Aged Dental Composites: A Supervised Machine-Learning Approach

Bibliographic Details
Published in: Analytical chemistry (Washington) Vol. 95; no. 12; pp. 5205-5213
Main Authors: Chen, Chien-chia, Mondal, Karabi, Vervliet, Philippe, Covaci, Adrian, O’Brien, Evan P., Rockne, Karl J., Drummond, James L., Hanley, Luke
Format: Journal Article
Language: English
Published: United States American Chemical Society 28-03-2023
Description
Summary: Compound identification by database searching, which matches experimental mass spectra against library spectra, is commonly used in mass spectrometric (MS) data analysis. Vendor software often outputs scores that represent the quality of each spectral match for the identified compounds. However, software-generated identification results can differ drastically depending on the initial search parameters. Machine learning is applied here to provide a statistical evaluation of software-generated compound identification results from experimental tandem MS data. This task was accomplished using the logistic regression algorithm to assign an identification probability value to each identified compound. Logistic regression is usually used for classification, but here it is used to generate identification probabilities without setting a threshold for classification. Liquid chromatography coupled with quadrupole-time-of-flight tandem MS was used to analyze the organic monomers leached from resin-based dental composites in a simulated oral environment. The collected tandem MS data were processed with vendor software, followed by statistical evaluation of these results using logistic regression. The identification probability assigned to each compound provides more confidence in the identification than database matching alone. A total of 21 distinct monomers were identified among all samples, including five intact monomers and chemical degradation products of bisphenol A glycidyl methacrylate (BisGMA), oligomers of bisphenol-A ethoxylate methacrylate (BisEMA), triethylene glycol dimethacrylate (TEGDMA), and urethane dimethacrylate (UDMA). The logistic regression model can be used to evaluate any database-matched liquid chromatography-tandem MS result by training a new model using analytical standards of compounds present in a chosen database and then generating identification probabilities for candidates from unknown data using the new model.
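
The workflow described in the summary (train on analytical standards, then report a probability for each candidate rather than a thresholded class label) can be illustrated with a minimal sketch. This is not the authors' implementation. The two features (a scaled library match score and a scaled mass error), the training values, and the candidate names are hypothetical placeholders chosen only to show the idea, and the model is fitted with plain batch gradient descent instead of any particular library.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(features, labels, lr=0.1, epochs=5000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            err = p - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def identification_probability(x, weights, bias):
    # Return the probability itself; no classification threshold is applied.
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Hypothetical training set standing in for analytical standards:
# [scaled match score, scaled mass error]; label 1 = correct identification.
train_X = [[0.95, 0.10], [0.90, 0.20], [0.85, 0.15],
           [0.40, 0.80], [0.30, 0.90], [0.50, 0.70]]
train_y = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic(train_X, train_y)

# Hypothetical candidates from "unknown" data.
candidates = {"candidate_A": [0.92, 0.12], "candidate_B": [0.35, 0.85]}
probs = {name: identification_probability(x, weights, bias)
         for name, x in candidates.items()}

for name, p in probs.items():
    print(f"{name}: identification probability = {p:.2f}")
```

Reporting the raw probability leaves the decision of how much confidence is enough to the analyst, which is the distinction the summary draws between this use and ordinary classification.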
ISSN: 0003-2700, 1520-6882
DOI: 10.1021/acs.analchem.2c04362