Using somatic variant richness to mine signals from rare variants in the cancer genome. Academic Article

Overview

abstract

  • To date, the vast preponderance of somatic variants observed in the cancer genome have been rare variants, and it is common in practice to encounter in a new tumor variants that have not been observed previously. Here we focus on probability estimation for encountering such hitherto unseen variants. We draw upon statistical methodology that has been developed in other fields of study, notably in species estimation in ecology, and word frequency estimation in computational linguistics. Analysis of whole-exome and targeted panel sequencing data sets reveals substantial variability in variant "richness" between genes that could be harnessed for clinically relevant problems. We quantify the variant-tissue association and show a strong gene-specific, lineage-dependent pattern of encountering new variants. This variability is largely determined by the proportion of observed variants that are rare. Our findings suggest that variants that occur at very low frequencies can harbor important signals that are clinically consequential.
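
The abstract does not name the estimator it uses. As an illustration only, a classic estimator from species estimation and word-frequency estimation is the Good-Turing estimate of unseen probability mass: the chance that the next observation is a previously unseen variant is approximated by N1 / N, where N1 is the number of distinct variants seen exactly once and N is the total number of observations. The sketch below assumes that estimator. The function name and the toy variant labels are hypothetical and are not taken from the article.

```python
from collections import Counter


def good_turing_unseen_probability(variant_labels):
    """Good-Turing estimate of the probability that the next observation
    is a variant not seen so far: (number of singletons) / (total observations).

    Assumption: this is one standard estimator from ecology and linguistics,
    not necessarily the exact method used in the article.
    """
    counts = Counter(variant_labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one observation is required")
    # Variants observed exactly once carry the signal about unseen variants.
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / total


# Hypothetical toy data: one label per variant observed across tumors.
observations = [
    "KRAS G12D", "KRAS G12D", "KRAS G12V",
    "TP53 R175H", "TP53 R273C", "TP53 R248Q",
]
p_new = good_turing_unseen_probability(observations)
print(f"Estimated probability of a new variant: {p_new:.3f}")
```

Under this estimator, a gene whose observed variants are mostly singletons gets a high estimate, which matches the abstract's remark that the variability is largely determined by the proportion of observed variants that are rare.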

publication date

  • December 3, 2019

Research

keywords

  • Genetic Predisposition to Disease
  • Genome, Human
  • Mutation
  • Neoplasms

Identity

PubMed Central ID

  • PMC6890761

Scopus Document Identifier

  • 85076097314

Digital Object Identifier (DOI)

  • 10.1038/s41467-019-13402-z

PubMed ID

  • 31796730

Additional Document Info

volume

  • 10

issue

  • 1