Using biological constraints to improve prediction in precision oncology. Academic Article

Overview

abstract

  • Many gene signatures have been developed by applying machine learning (ML) to omics profiles; however, their clinical utility is often hindered by limited interpretability and unstable performance. Here, we show the importance of embedding prior biological knowledge in the decision rules yielded by ML approaches to build robust classifiers. We tested this by applying different ML algorithms to gene expression data to predict three difficult cancer phenotypes: bladder cancer progression to muscle-invasive disease, response to neoadjuvant chemotherapy in triple-negative breast cancer, and prostate cancer metastatic progression. We developed two sets of classifiers: mechanistic, in which training was restricted to features capturing specific biological mechanisms; and agnostic, in which training did not use any a priori biological information. Mechanistic models had similar or better testing performance than their agnostic counterparts, with enhanced interpretability. Our findings support the use of biological constraints to develop robust gene signatures with high translational potential.

publication date

  • February 2, 2023

Identity

PubMed Central ID

  • PMC8152575

Scopus Document Identifier

  • 85147962621

Digital Object Identifier (DOI)

  • 10.1016/j.isci.2023.106108

PubMed ID

  • 36852282

Additional Document Info

volume

  • 26

issue

  • 3