Using association rule mining for phenotype extraction from electronic health records.
The increasing adoption of electronic health records (EHRs) driven by Meaningful Use is creating unprecedented opportunities for secondary use of EHR data. Significant emphasis is being placed on developing algorithms and methods for phenotype extraction from EHRs to facilitate population-based studies for clinical and translational research. While preliminary work has shown demonstrable progress, it is becoming increasingly clear that developing, implementing, and testing phenotyping algorithms is a time- and resource-intensive process. To this end, we propose an efficient machine learning technique, distributional association rule mining (ARM), for semi-automatic modeling of phenotyping algorithms. ARM provides a highly efficient and robust framework for discovering the most predictive set of phenotype definition criteria and rules from large datasets. Our preliminary results indicate that, compared with other machine learning techniques such as logistic regression and support vector machines, ARM not only performs significantly better but also generates rule patterns that are amenable to human interpretation.
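To make the idea of rule discovery concrete, the following is a minimal sketch of generic support/confidence association rule mining aimed at a single target label. It is not the distributional ARM variant proposed here, whose details are not given in this abstract. The function name `mine_rules`, the thresholds, and the toy patient records are all hypothetical and invented purely for illustration.

```python
from itertools import combinations

# Hypothetical toy records: each patient is a set of coded findings.
# These values are invented for illustration and come from no real EHR.
records = [
    {"dx_diabetes_code", "lab_hba1c_high", "rx_metformin", "case"},
    {"lab_hba1c_high", "rx_metformin", "case"},
    {"dx_diabetes_code", "lab_hba1c_high", "case"},
    {"rx_metformin"},
    {"dx_hypertension_code"},
    {"dx_hypertension_code", "lab_hba1c_high"},
]


def mine_rules(records, target, min_support=0.3, min_confidence=0.8, max_len=2):
    """Return rules `antecedent -> target` as (antecedent, support, confidence).

    support    = fraction of all records containing antecedent and target
    confidence = fraction of antecedent-matching records that contain target
    """
    n = len(records)
    # Candidate items are every observed item except the target label itself.
    items = sorted({item for r in records for item in r} - {target})
    rules = []
    for k in range(1, max_len + 1):
        for antecedent in combinations(items, k):
            wanted = set(antecedent)
            covered = [r for r in records if wanted <= r]
            if not covered:
                continue
            both = sum(1 for r in covered if target in r)
            support = both / n
            confidence = both / len(covered)
            if support >= min_support and confidence >= min_confidence:
                rules.append((antecedent, support, confidence))
    # Most confident rules first, then highest support, then alphabetical.
    rules.sort(key=lambda rule: (-rule[2], -rule[1], rule[0]))
    return rules


if __name__ == "__main__":
    for antecedent, support, confidence in mine_rules(records, "case"):
        print(f"{' AND '.join(antecedent)} -> case "
              f"(support={support:.2f}, confidence={confidence:.2f})")
```

Each surviving rule is a short conjunction of criteria, which is what makes this family of methods easy for a clinician to read and review. The brute-force enumeration above is exponential in `max_len`; a practical implementation would prune candidates, for example with Apriori or FP-Growth.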