Unsupervised analysis of transcriptomic profiles reveals six glioma subtypes.

Overview

Abstract

Gliomas are the most common primary brain tumors in adults and a significant cause of cancer-related mortality. Defining glioma subtypes based on objective genetic and molecular signatures may allow for a more rational, patient-specific approach to therapy in the future. Classifications based on gene expression data have been attempted in the past with varying success and with only some concordance between studies, possibly because of bias introduced by analytic methodologies that select genes a priori, before classification. To overcome this potential source of bias, we applied two unsupervised machine learning methods to genome-wide gene expression profiles of 159 gliomas, thereby establishing a robust glioma classification model that relies only on the molecular data. The model predicts two major groups of gliomas (an oligodendroglioma-rich group and a glioblastoma-rich group) separable into six hierarchically nested subtypes. We then identified six sets of classifiers that can be used to assign any given glioma to the corresponding subtype, and validated these classifiers using both an internal data set (189 additional independent samples) and two external data sets (341 patients). Applying the classification system to the external glioma data sets allowed us to identify previously unrecognized prognostic groups within previously published data and within The Cancer Genome Atlas glioblastoma samples. It also allowed us to identify the biological pathways associated with the different glioma subtypes, offering a potential clue to the pathogenesis of, and possibly therapeutic targets for, tumors within each subtype.