The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models.

Academic Article

Overview

abstract

Gene expression data from microarrays are being applied to predict preclinical and clinical endpoints, but the reliability of these predictions has not been established. In the MAQC-II project, 36 independent teams analyzed six microarray data sets to generate predictive models for classifying a sample with respect to one of 13 endpoints indicative of lung or liver toxicity in rodents, or of breast cancer, multiple myeloma or neuroblastoma in humans. In total, >30,000 models were built using many combinations of analytical methods. The teams generated predictive models without knowing the biological meaning of some of the endpoints and, to mimic clinical reality, tested the models on data that had not been used for training. We found that model performance depended largely on the endpoint and team proficiency and that different approaches generated models of similar performance. The conclusions and recommendations from MAQC-II should be useful for regulatory agencies, study committees and independent investigators that evaluate methods for global gene expression analysis.

authors

Campbell, Gregory

Jones, Wendell D

Campagne, Fabien

Walker, Stephen J

Goodsaid, Federico M

Shaughnessy, John D

Oberthuer, André

Thomas, Russell S

Paules, Richard S

Fischer, Matthias

Furlanello, Cesare

Gallas, Brandon D

Megherbi, Dalila B

Symmans, W Fraser

Brors, Benedikt

Bushel, Pierre R

Davison, Timothy S

Delorenzi, Mauro

Devanarayan, Viswanath

Dopazo, Joaquin

Gonzaludo, Nina

Hess, Kenneth R

Irizarry, Rafael A

Judson, Richard

Juraeva, Dilafruz

Lababidi, Samir

Lambert, Christophe G

Lobenhofer, Edward K

McCall, Matthew N

Pennello, Gene A

Perkins, Roger G

Price, Nathan D

Scherer, Andreas

Thierry-Mieg, Danielle

Thierry-Mieg, Jean

Thodima, Venkata

Vishnuvajjala, Lakshmi

Yousef, Waleed A

Arasappan, Dhivya

Lucas, Anne Bergstrom

Berthold, Frank

Brennan, Richard J

Buness, Andreas

Catalano, Jennifer G

Demichelis, Francesca

Dosymbekov, Damir

Fostel, Jennifer

Fulmer-Smentek, Stephanie

Fuscoe, James C

Goldstein, Darlene R

Halbert, Donald N

Harris, Stephen C

Hatzis, Christos

Huang, Jianping

Jensen, Roderick V

Johnson, Charles D

Jurman, Giuseppe

Kahlert, Yvonne

Khuder, Sadik A

Martinez-Murillo, Francisco

Medina, Ignacio

Moffitt, Richard A

Montaner, David

publication date

July 30, 2010

published in

Nature Biotechnology (Journal)

Research

keywords

Liver Diseases
Lung Diseases
Neoplasms
Oligonucleotide Array Sequence Analysis

Identity

PubMed Central ID

PMC3315840

Scopus Document Identifier

78650735473

Digital Object Identifier (DOI)

10.1038/nbt.1665

PubMed ID

20676074

Additional Document Info

has global citation frequency

737

volume

28

issue

8