Development of a Machine Learning Algorithm for Drug Screening Analysis on High-Resolution UPLC-MSE/QTOF Mass Spectrometry.
Academic Article
Abstract
BACKGROUND: Ultra-performance liquid chromatography (UPLC)-MSE/quadrupole time-of-flight (QTOF) high-resolution mass spectrometry uses untargeted, data-independent acquisition in a dual mode that simultaneously collects precursor ions at low collision energy and product ions at ramped collision energy. However, neither the algorithmic analysis of the large-scale multivariate data generated by comprehensive drug screening nor the positivity criteria for drug identification have been systematically investigated. It is also unclear whether the ion ratio (IR), defined as the intensity of a specified product ion divided by the intensity of the precursor ion, is stable enough to be incorporated into an MSE/QTOF data analysis algorithm.

METHODS: The IRs of 91 drugs were determined experimentally, and IR variation was assessed across 5 concentrations measured on 3 different days. A data-driven machine learning approach was used to develop multivariate linear regression (MLR) models from drug-supplemented urine samples; the models incorporated mass error, retention time, number of detected fragment ions and IR, accuracy of isotope abundance, and peak response. Model performance was evaluated on an independent data set of unknown clinical urine samples by comparison with the results of manual analysis.

RESULTS: The IRs of most compounds acquired by MSE/QTOF were low and concentration-dependent (i.e., IR increased at higher concentrations). We developed an MLR model that outputs a composite score from 7 parameters to predict positive drug identification. The model achieved a mean accuracy of 89.38% in the validation set and 87.92% agreement in the test set.

CONCLUSIONS: The MLR model incorporating all contributing parameters can serve as a decision-support tool to facilitate objective drug identification using UPLC-MSE/QTOF.
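
To make the two quantitative ideas in the abstract concrete, the following Python sketch computes an ion ratio as defined above and fits a small linear regression whose output is used as a composite score. It is an illustration only, not the authors' implementation: the abstract does not report the feature encoding, the coefficients, or the score cutoff, so the function names, the four feature columns, the toy values, and the 0.5 cutoff are all assumptions made for this example.

```python
import numpy as np


def ion_ratio(product_intensity, precursor_intensity):
    """IR as defined in the abstract: product-ion intensity / precursor-ion intensity."""
    product = np.asarray(product_intensity, dtype=float)
    precursor = np.asarray(precursor_intensity, dtype=float)
    return product / precursor


def percent_cv(values):
    """Coefficient of variation (%) of replicate values, e.g. IRs across days."""
    values = np.asarray(values, dtype=float)
    return 100.0 * values.std(ddof=1) / values.mean()


def _with_intercept(features):
    """Prepend a column of ones so the model has an intercept term."""
    features = np.asarray(features, dtype=float)
    return np.column_stack([np.ones(len(features)), features])


def fit_mlr(features, labels):
    """Ordinary least-squares fit; returns [intercept, coef_1, ..., coef_k]."""
    X = _with_intercept(features)
    y = np.asarray(labels, dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def composite_score(features, coef):
    """Linear combination of the features: the model's composite score."""
    return _with_intercept(features) @ coef


def call_positive(scores, cutoff=0.5):
    """Hypothetical decision rule: positive when the score reaches the cutoff."""
    return np.asarray(scores) >= cutoff


# Toy training data (invented for illustration, NOT from the study).
# Hypothetical columns: |mass error| (ppm), |RT deviation| (min),
# number of fragment ions detected, ion ratio.
toy_features = np.array([
    [1.0, 0.02, 4, 0.30],
    [2.0, 0.05, 3, 0.25],
    [1.5, 0.03, 5, 0.40],
    [8.0, 0.30, 0, 0.00],
    [9.5, 0.25, 1, 0.02],
    [7.0, 0.40, 0, 0.01],
])
toy_labels = np.array([1, 1, 1, 0, 0, 0])  # 1 = drug confirmed by manual review

coef = fit_mlr(toy_features, toy_labels)
scores = composite_score(toy_features, coef)
print("coefficients:", np.round(coef, 3))
print("composite scores:", np.round(scores, 3))
print("positive calls:", call_positive(scores))
```

In practice the cutoff would be chosen on a validation set and the agreement with manual analysis checked on independent samples, as the abstract describes. Note also that the reported model uses 7 parameters, whereas this sketch uses 4 only to keep the toy data readable.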