A simple and reproducible breast cancer prognostic test.

Overview

abstract

  • BACKGROUND: A small number of prognostic and predictive tests based on gene expression are currently offered as reference laboratory tests. In contrast to such success stories, a number of flaws and errors have recently been identified in other genomic-based predictors, and the success rate for developing clinically useful genomic signatures is low. These errors have led to widespread concerns about the protocols for conducting and reporting computational research. As a result, a need has emerged for a template for reproducible development of genomic signatures that incorporates full transparency, data sharing and statistical robustness.
  • RESULTS: Here we present the first fully reproducible analysis of the data used to train and test MammaPrint, an FDA-cleared prognostic test for breast cancer based on a 70-gene expression signature. We provide all the software and documentation necessary for researchers to build and evaluate genomic classifiers based on these data. As an example of the utility of this reproducible research resource, we develop a simple prognostic classifier that uses only 16 genes from the MammaPrint signature and is equally accurate in predicting 5-year disease-free survival.
  • CONCLUSIONS: Our study provides a prototypic example for reproducible development of computational algorithms for learning prognostic biomarkers in the era of personalized medicine.

publication date

  • May 17, 2013

Research

keywords

  • Breast Neoplasms
  • Computational Biology
  • Gene Expression Profiling

Identity

PubMed Central ID

  • PMC3662649

Scopus Document Identifier

  • 84877806149

Digital Object Identifier (DOI)

  • 10.1186/1471-2164-14-336

PubMed ID

  • 23682826

Additional Document Info

volume

  • 14