Multimodal Data Integration Advances Longitudinal Prediction of the Naturalistic Course of Depression and Reveals a Multimodal Signature of Remission During 2-Year Follow-up.
Academic Article
Overview
abstract
BACKGROUND: The ability to predict the disease course of individuals with major depressive disorder (MDD) is essential for optimal treatment planning. Here, we used a data-driven machine learning approach to assess the predictive value of different sets of biological data (whole-blood proteomics, lipid metabolomics, transcriptomics, genetics), both separately and added to clinical baseline variables, for the longitudinal prediction of 2-year remission status in MDD at the individual-subject level. METHODS: Prediction models were trained and cross-validated in a sample of 643 patients with current MDD (2-year remission n = 325) and subsequently tested for performance in 161 individuals with MDD (2-year remission n = 82). RESULTS: Proteomics data showed the best unimodal data predictions (area under the receiver operating characteristic curve = 0.68). Adding proteomic to clinical data at baseline significantly improved 2-year MDD remission predictions (area under the receiver operating characteristic curve = 0.63 vs. 0.78, p = .013), while the addition of other omics data to clinical data did not yield significantly improved model performance. Feature importance and enrichment analysis revealed that proteomic analytes were involved in inflammatory response and lipid metabolism, with fibrinogen levels showing the highest variable importance, followed by symptom severity. Machine learning models outperformed psychiatrists' ability to predict 2-year remission status (balanced accuracy = 71% vs. 55%). CONCLUSIONS: This study showed the added predictive value of combining proteomic data, but not other omics data, with clinical data for the prediction of 2-year remission status in MDD. Our results reveal a novel multimodal signature of 2-year MDD remission status that shows clinical potential for individual MDD disease course predictions from baseline measurements.