Analyzing the heterogeneity and complexity of Electronic Health Record oriented phenotyping algorithms.
Academic Article



  • The need for formal representations of eligibility criteria for clinical trials, and for phenotyping more generally, has been recognized for some time. Indeed, a formal computable representation that adequately reflects the types of data and logic found in trial designs is a prerequisite for automatically identifying study-eligible patients from Electronic Health Records. As part of the wider process of representation development, this paper reports on an analysis of fourteen Electronic Health Record-oriented phenotyping algorithms (developed as part of the eMERGE project) in terms of their constituent data elements, the types of logic used, and their temporal characteristics. We found that the majority of the eMERGE algorithms analyzed include complex, nested Boolean logic and negation, and that several depend on cardinality constraints and complex temporal logic. Insights gained from the study will be used to augment the CDISC Protocol Representation Model.
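
To make the kinds of logic named in the abstract concrete, the following is a minimal Python sketch of a single made-up phenotype rule that combines nested Boolean logic, negation, a cardinality constraint, and a temporal constraint. It is not one of the eMERGE algorithms and does not use the CDISC Protocol Representation Model. The `Event` structure, the helpers `count` and `within`, the rule `is_case`, the thresholds, and the code labels (`DX_A`, `MED_X`, `LAB_ABNORMAL`, `DX_EXCLUDE`) are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Event:
    """One coded entry in a patient's record (hypothetical structure)."""
    code: str   # placeholder label, not a real terminology code
    when: date


def count(events, code):
    """Cardinality helper: how many times a code appears in the record."""
    return sum(1 for e in events if e.code == code)


def within(events, first, second, max_days):
    """Temporal helper: some `second` event falls 0..max_days after a `first`."""
    window = timedelta(days=max_days)
    return any(
        timedelta(0) <= b.when - a.when <= window
        for a in events if a.code == first
        for b in events if b.code == second
    )


def is_case(events):
    """Illustrative rule only; thresholds and codes are assumptions."""
    has_diagnosis = count(events, "DX_A") >= 2            # cardinality constraint
    has_evidence = (                                      # nested Boolean logic
        count(events, "LAB_ABNORMAL") >= 1
        or within(events, "DX_A", "MED_X", max_days=90)   # temporal constraint
    )
    excluded = count(events, "DX_EXCLUDE") > 0            # negation (exclusion)
    return has_diagnosis and has_evidence and not excluded


record = [
    Event("DX_A", date(2010, 1, 5)),
    Event("DX_A", date(2010, 3, 1)),
    Event("MED_X", date(2010, 3, 20)),
]
print(is_case(record))
```

Even this toy rule needs four distinct constructs to evaluate a single patient, which suggests why a formal representation aimed at such algorithms would have to express more than flat lists of inclusion codes.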

publication date

  • October 22, 2011



  • Algorithms
  • Electronic Health Records
  • Hashimoto Disease
  • Hypothyroidism


PubMed Central ID

  • PMC3243189

Scopus Document Identifier

  • 84870749672

PubMed ID

  • 22195079

Additional Document Info


  • 2011