RExPRT: a machine learning tool to predict pathogenicity of tandem repeat loci. Academic Article

Overview

abstract

  • Expansions of tandem repeats (TRs) cause approximately 60 monogenic diseases. We expect that the discovery of additional pathogenic repeat expansions will narrow the diagnostic gap in many diseases. A growing number of TR expansions are being identified, and interpreting them is a challenge. We present RExPRT (Repeat EXpansion Pathogenicity pRediction Tool), a machine learning tool for distinguishing pathogenic from benign TR expansions. Our results demonstrate that an ensemble approach classifies TRs with an average precision of 93% and recall of 83%. RExPRT's high precision will be valuable in large-scale discovery studies, which require prioritization of candidate loci for follow-up studies.
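
The abstract reports classifier performance as precision and recall. As a reminder of what those two figures measure, here is a minimal sketch using the standard definitions for a binary classifier. The function name `precision_recall` and the toy labels are hypothetical, chosen only for illustration; they are not RExPRT's code or data, and the numbers are unrelated to the published results.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels, where 1 = pathogenic, 0 = benign."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # pathogenic, called pathogenic
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign, called pathogenic
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # pathogenic, called benign
    # Precision: what fraction of loci called pathogenic truly are.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: what fraction of truly pathogenic loci were called.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy labels (made up for illustration only).
y_true = [1, 1, 1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
precision, recall = precision_recall(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In a discovery setting, high precision means few benign loci are carried forward for follow-up, at the cost of missing some true pathogenic loci when recall is lower.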

publication date

  • January 31, 2024

Research

keywords

  • Machine Learning
  • Tandem Repeat Sequences

Identity

PubMed Central ID

  • PMC10832122

Digital Object Identifier (DOI)

  • 10.1186/s13059-024-03171-4

PubMed ID

  • 38297326

Additional Document Info

volume

  • 25

issue

  • 1