Detecting and exploiting etiologic heterogeneity in epidemiologic studies.

Overview

Contemporary searches for new risk factors frequently involve genome-wide explorations of very large numbers of candidate risk variants. Because diseases can often be classified into subtypes that show evidence of etiologic heterogeneity, the question arises whether a search for new risk factors would be improved by looking separately within subtypes. Etiologic heterogeneity inevitably increases the signal in at least one of the subtypes, but this advantage may be offset by the smaller sample sizes within subtypes and the increased chance of false discovery. In this article, the authors show that only a relatively modest degree of etiologic heterogeneity is necessary for subtyping strategies to have improved statistical power. In practice, effective exploitation of etiologic heterogeneity requires strong evidence that the subtypes selected are likely to exhibit substantial heterogeneity. Further, identifying the subtypes with the most heterogeneous risk profiles is important for optimizing the search for new risk factors. The concepts are illustrated by using data from a breast cancer study in which results are available separately for estrogen receptor-positive (ER+) and estrogen receptor-negative (ER-) tumors.
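
The trade-off between a stronger signal within a subtype and the costs of smaller samples and extra tests can be illustrated with a rough power calculation. The sketch below is not the authors' method; it is a minimal normal-approximation example under stated assumptions: a binary exposure, two subtypes, each subtype's cases compared with all controls, a Bonferroni split of the significance level across the two subtype tests, and the two subtype tests treated as independent even though they share controls. The function names, the exposure frequency, and the default significance threshold are all hypothetical choices made for illustration.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power_two_sided(effect, se, alpha):
    """Power of a two-sided z-test for a true effect with standard error se."""
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    ncp = effect / se  # noncentrality: true effect in standard-error units
    return _STD_NORMAL.cdf(ncp - z_crit) + _STD_NORMAL.cdf(-ncp - z_crit)


def se_log_or(n_cases, n_controls, exposure_freq):
    """Approximate SE of a log odds ratio for a binary exposure near the null."""
    v = exposure_freq * (1 - exposure_freq)
    return sqrt(1 / (n_cases * v) + 1 / (n_controls * v))


def compare_strategies(beta1, beta2, n_cases, n_controls,
                       frac1=0.5, exposure_freq=0.3, alpha=5e-8):
    """Return (pooled_power, subtyped_power) for one candidate variant.

    beta1, beta2: assumed log odds ratios in subtype 1 and subtype 2.
    frac1: assumed fraction of cases belonging to subtype 1.
    """
    # Pooled analysis: the effect is approximated as the case-weighted
    # average of the subtype-specific log odds ratios (an assumption).
    pooled_effect = frac1 * beta1 + (1 - frac1) * beta2
    pooled = power_two_sided(
        pooled_effect, se_log_or(n_cases, n_controls, exposure_freq), alpha)

    # Subtyped analysis: each subtype's cases versus all controls,
    # with the significance level halved to account for two tests.
    p1 = power_two_sided(
        beta1, se_log_or(n_cases * frac1, n_controls, exposure_freq), alpha / 2)
    p2 = power_two_sided(
        beta2, se_log_or(n_cases * (1 - frac1), n_controls, exposure_freq),
        alpha / 2)

    # Probability that at least one subtype test is significant, treating
    # the tests as independent (an approximation, since controls are shared).
    subtyped = 1 - (1 - p1) * (1 - p2)
    return pooled, subtyped


if __name__ == "__main__":
    for b1, b2 in [(0.15, 0.15), (0.30, 0.0)]:
        pooled, subtyped = compare_strategies(b1, b2, 5000, 5000)
        print(f"beta1={b1}, beta2={b2}: pooled={pooled:.4f}, subtyped={subtyped:.4f}")
```

Under these illustrative inputs, a variant with the same effect in both subtypes is detected more readily by the pooled analysis, whereas a variant whose effect is confined to one subtype is detected more readily by the subtyped analysis. How much heterogeneity is needed before subtyping pays off depends on the assumed subtype proportions, sample sizes, and significance threshold.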