Development and Evaluation of Computable Phenotypes in Pediatric Epilepsy: 3 Cases.
Academic Article
Overview
abstract
INTRODUCTION: Computable phenotypes allow identification of well-defined patient cohorts from electronic health record data. Little is known about the accuracy of diagnostic codes for important clinical concepts in pediatric epilepsy, such as (1) risk factors like neonatal hypoxic-ischemic encephalopathy; (2) clinical concepts like treatment resistance; and (3) syndromes like juvenile myoclonic epilepsy. We developed computable phenotypes for these three examples and evaluated their performance using electronic health record data at a single center.
METHODS: We identified gold standard cohorts for neonatal hypoxic-ischemic encephalopathy, pediatric treatment-resistant epilepsy, and juvenile myoclonic epilepsy via existing registries and review of clinical notes. From the electronic health record, we extracted diagnostic and procedure codes for all children with a diagnosis of epilepsy and seizures. We used these codes to develop computable phenotypes and evaluated them using sensitivity, positive predictive value, and the F-measure.
RESULTS: For neonatal hypoxic-ischemic encephalopathy, the best-performing computable phenotype (HIE ICD-9/10 and [brain magnetic resonance imaging (MRI) or electroencephalography (EEG) within 120 days of life] and absence of commonly miscoded conditions) had high sensitivity (95.7%, 95% confidence interval [CI] 85-99), positive predictive value (100%, 95% CI 95-100), and F-measure (0.98). For treatment-resistant epilepsy, the best-performing computable phenotype (3 or more antiseizure medicines in the last 2 years or treatment-resistant ICD-10) had a sensitivity of 86.9% (95% CI 79-93), positive predictive value of 69.6% (95% CI 60-79), and F-measure of 0.77. For juvenile myoclonic epilepsy, the best-performing computable phenotype (JME ICD-10) had poor sensitivity (52%, 95% CI 43-60) but high positive predictive value (90.4%, 95% CI 81-96); the F-measure was 0.66.
CONCLUSION: The variable accuracy of our computable phenotypes (high for hypoxic-ischemic encephalopathy, medium for treatment resistance, and low for juvenile myoclonic epilepsy) demonstrates how heterogeneous success can be when administrative data are used to identify cohorts important for pediatric epilepsy research.
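The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes the F-measure is the balanced F1 score (the harmonic mean of sensitivity and positive predictive value), which the abstract does not state explicitly but which reproduces the reported values. The function names `f_measure` and `evaluate` and the use of patient-identifier sets are illustrative assumptions, not the authors' implementation.

```python
def f_measure(sensitivity: float, ppv: float) -> float:
    """Balanced F-measure (F1): harmonic mean of sensitivity and PPV.

    Assumption: the abstract's "F-measure" is the balanced F1 score.
    """
    if sensitivity + ppv == 0:
        return 0.0
    return 2 * sensitivity * ppv / (sensitivity + ppv)


def evaluate(flagged: set, gold: set) -> dict:
    """Compare patients flagged by a phenotype against a gold standard cohort.

    Both arguments are hypothetical sets of patient identifiers.
    """
    true_positives = len(flagged & gold)
    sensitivity = true_positives / len(gold) if gold else 0.0
    ppv = true_positives / len(flagged) if flagged else 0.0
    return {
        "sensitivity": sensitivity,
        "ppv": ppv,
        "f_measure": f_measure(sensitivity, ppv),
    }


# Sensitivity and PPV point estimates as reported in the abstract.
reported = {
    "hypoxic-ischemic encephalopathy": (0.957, 1.000),
    "treatment-resistant epilepsy": (0.869, 0.696),
    "juvenile myoclonic epilepsy": (0.520, 0.904),
}

for name, (sens, ppv) in reported.items():
    print(f"{name}: F = {f_measure(sens, ppv):.2f}")
# hypoxic-ischemic encephalopathy: F = 0.98
# treatment-resistant epilepsy: F = 0.77
# juvenile myoclonic epilepsy: F = 0.66
```

Because the harmonic mean is pulled toward the lower of its two inputs, the juvenile myoclonic epilepsy phenotype scores 0.66 despite a positive predictive value above 90%: its low sensitivity dominates.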