Develop and validate a computable phenotype for the identification of Alzheimer's disease patients using electronic health record data. Academic Article

Overview

abstract

  • INTRODUCTION: Alzheimer's disease (AD) is often misclassified in electronic health records (EHRs) when relying solely on diagnosis codes. This study aimed to develop a more accurate computable phenotype (CP) for identifying AD patients using structured and unstructured EHR data. METHODS: We used EHRs from the University of Florida Health (UFHealth) system and created rule-based CPs iteratively through manual chart reviews. The CPs were then validated using data from the University of Texas Health Science Center at Houston (UTHealth) and the University of Minnesota (UMN). RESULTS: Our best-performing CP was "patient has at least 2 AD diagnoses and AD-related keywords in AD encounters," with an F1-score of 0.817 at UFHealth, 0.961 at UTHealth, and 0.623 at UMN. DISCUSSION: We developed and validated rule-based CPs for AD identification with good performance, which will be crucial for studies that aim to use real-world data like EHRs. HIGHLIGHTS: Developed a computable phenotype (CP) to identify Alzheimer's disease (AD) patients using EHR data. Utilized both structured and unstructured EHR data to enhance CP accuracy. Achieved F1-scores of 0.817 at UFHealth, 0.961 at UTHealth, and 0.623 at UMN. Validated the CP across different demographics, ensuring robustness and fairness.
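The best-performing rule quoted in the abstract ("at least 2 AD diagnoses and AD-related keywords in AD encounters") and the F1-score used to evaluate it can be illustrated with a minimal Python sketch. This is not the authors' implementation. The code set (ICD-10-CM G30.x and ICD-9-CM 331.0), the keyword list, the `Encounter` structure, the function names, and the reading of "AD encounter" as an encounter carrying an AD diagnosis code are all assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass

# ASSUMPTION: AD diagnosis codes are ICD-10-CM G30.x and ICD-9-CM 331.0.
# The study's actual code list is not given in the abstract.
AD_CODE_PREFIXES = ("G30", "331.0")

# ASSUMPTION: hypothetical keyword list; the study's keywords are not given.
AD_KEYWORDS = ("alzheimer",)


@dataclass
class Encounter:
    """Hypothetical, simplified view of one EHR encounter."""
    diagnosis_codes: list[str]   # structured data
    note_text: str = ""          # unstructured clinical note


def is_ad_encounter(enc: Encounter) -> bool:
    """An encounter counts as an 'AD encounter' if it carries an AD code."""
    return any(code.upper().startswith(AD_CODE_PREFIXES)
               for code in enc.diagnosis_codes)


def meets_phenotype(encounters: list[Encounter], min_diagnoses: int = 2) -> bool:
    """Rule sketch: >= min_diagnoses AD-coded encounters, and an AD keyword
    in the note of at least one of those AD encounters."""
    ad_encounters = [e for e in encounters if is_ad_encounter(e)]
    if len(ad_encounters) < min_diagnoses:
        return False
    return any(kw in e.note_text.lower()
               for e in ad_encounters
               for kw in AD_KEYWORDS)


def f1_score(predicted: list[bool], actual: list[bool]) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), with chart-review labels as `actual`."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(a and not p for p, a in zip(predicted, actual))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Toy patient: two AD-coded encounters, one with a matching keyword.
    patient = [
        Encounter(["G30.9"], "Assessment: Alzheimer's disease, moderate."),
        Encounter(["G30.9", "I10"], "Follow-up visit."),
    ]
    print(meets_phenotype(patient))
```

Restricting the keyword search to AD-coded encounters, as this sketch does, means a mention such as "family history of Alzheimer's" in an unrelated visit does not by itself satisfy the rule. Whether the published CP applies the same restriction in the same way is not stated in the abstract.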

publication date

  • July 3, 2024

Identity

PubMed Central ID

  • PMC11220631

Digital Object Identifier (DOI)

  • 10.1002/dad2.12613

PubMed ID

  • 38966622

Additional Document Info

volume

  • 16

issue

  • 3