Implications of the Use of Artificial Intelligence Predictive Models in Health Care Settings : A Simulation Study.

Overview

abstract

BACKGROUND: Substantial effort has been directed toward demonstrating uses of predictive models in health care. However, implementation of these models into clinical practice may influence patient outcomes, which in turn are captured in electronic health record data. As a result, deployed models may affect the predictive ability of current and future models. OBJECTIVE: To estimate changes in predictive model performance with use through 3 common scenarios: model retraining, sequentially implementing 1 model after another, and intervening in response to a model when 2 are simultaneously implemented. DESIGN: Simulation of model implementation and use in critical care settings at various levels of intervention effectiveness and clinician adherence. Models were either trained or retrained after simulated implementation. SETTING: Admissions to the intensive care unit (ICU) at Mount Sinai Health System (New York, New York) and Beth Israel Deaconess Medical Center (Boston, Massachusetts). PATIENTS: 130 000 critical care admissions across both health systems. INTERVENTION: Across 3 scenarios, interventions were simulated at varying levels of clinician adherence and effectiveness. MEASUREMENTS: Statistical measures of performance, including threshold-independent (area under the curve) and threshold-dependent measures. RESULTS: At fixed 90% sensitivity, in scenario 1 a mortality prediction model lost 9% to 39% specificity after retraining once and in scenario 2 a mortality prediction model lost 8% to 15% specificity when created after the implementation of an acute kidney injury (AKI) prediction model; in scenario 3, models for AKI and mortality prediction implemented simultaneously, each led to reduced effective accuracy of the other by 1% to 28%. LIMITATIONS: In real-world practice, the effectiveness of and adherence to model-based recommendations are rarely known in advance. Only binary classifiers for tabular ICU admissions data were simulated. CONCLUSION: In simulated ICU settings, a universally effective model-updating approach for maintaining model performance does not seem to exist. Model use may have to be recorded to maintain viability of predictive modeling. PRIMARY FUNDING SOURCE: National Center for Advancing Translational Sciences.