Benchmarking Relatedness Inference Methods with Genome-Wide Data from Thousands of Relatives. Academic Article

Overview

abstract

  • Inferring relatedness from genomic data is an essential component of genetic association studies, population genetics, forensics, and genealogy. While numerous methods exist for inferring relatedness, thorough evaluation of these approaches in real data has been lacking. Here, we report an assessment of 12 state-of-the-art pairwise relatedness inference methods using a data set with 2485 individuals contained in several large pedigrees that span up to six generations. We find that all methods have high accuracy (92-99%) when detecting first- and second-degree relationships, but their accuracy dwindles to <43% for seventh-degree relationships. However, most identical by descent (IBD) segment-based methods inferred seventh-degree relatives correct to within one relatedness degree for >76% of relative pairs. Overall, the most accurate methods are Estimation of Recent Shared Ancestry (ERSA) and approaches that compute total IBD sharing using the output from GERMLINE and Refined IBD to infer relatedness. Combining information from the most accurate methods provides little accuracy improvement, indicating that novel approaches, such as new methods that leverage relatedness signals from multiple samples, are needed to achieve a sizeable jump in performance.
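The abstract reports two kinds of per-degree accuracy: the share of relative pairs whose inferred degree exactly matches the true degree, and the share inferred to within one degree. A minimal sketch of how such figures could be computed is given here for illustration only. The function name `accuracy_by_degree`, the `tolerance` parameter, and the toy pairs are hypothetical; none of them come from the article or its data set.

```python
from collections import defaultdict


def accuracy_by_degree(pairs, tolerance=0):
    """Return {true_degree: fraction of pairs inferred within `tolerance` degrees}.

    `pairs` is an iterable of (true_degree, inferred_degree) integer tuples.
    tolerance=0 gives exact-match accuracy; tolerance=1 gives the
    "within one relatedness degree" accuracy.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for true_deg, inferred_deg in pairs:
        totals[true_deg] += 1
        if abs(true_deg - inferred_deg) <= tolerance:
            hits[true_deg] += 1
    return {deg: hits[deg] / totals[deg] for deg in sorted(totals)}


# Toy, made-up pairs (NOT data from the study): (true degree, inferred degree).
toy_pairs = [(1, 1), (1, 1), (2, 2), (2, 3), (7, 7), (7, 6), (7, 8), (7, 5)]

exact = accuracy_by_degree(toy_pairs)
within_one = accuracy_by_degree(toy_pairs, tolerance=1)

for degree in exact:
    print(f"degree {degree}: exact={exact[degree]:.2f}, "
          f"within one={within_one[degree]:.2f}")
```

On these toy pairs the seventh-degree exact accuracy is much lower than the within-one-degree accuracy, which mirrors the qualitative pattern the abstract describes without reproducing any of its numbers.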

publication date

  • July 24, 2017

Research

keywords

  • Benchmarking
  • Genome-Wide Association Study
  • Genotyping Techniques
  • Pedigree
  • Population

Identity

PubMed Central ID

  • PMC5586387

Scopus Document Identifier

  • 85028958051

Digital Object Identifier (DOI)

  • 10.1534/genetics.117.1122

PubMed ID

  • 28739658

Additional Document Info

volume

  • 207

issue

  • 1