Machine Learning Prediction of Stroke Mechanism in Embolic Strokes of Undetermined Source.

Overview

abstract

BACKGROUND AND PURPOSE: One-fifth of ischemic strokes are embolic strokes of undetermined source (ESUS). Their theoretical causes can be classified as cardioembolic versus noncardioembolic. This distinction has important implications, but the categories' proportions are unknown.

METHODS: Using data from the Cornell Acute Stroke Academic Registry, we trained a machine-learning algorithm to distinguish cardioembolic versus noncardioembolic strokes, then applied the algorithm to ESUS cases to determine the predicted proportion with an occult cardioembolic source. A panel of neurologists adjudicated stroke etiologies using standard criteria. We trained a machine learning classifier using data on demographics, comorbidities, vitals, laboratory results, and echocardiograms. An ensemble predictive method including L1 regularization, gradient-boosted decision tree ensemble (XGBoost), random forests, and multivariate adaptive splines was used. Random search and cross-validation were used to tune hyperparameters. Model performance was assessed using cross-validation among cases of known etiology. We applied the final algorithm to an independent set of ESUS cases to determine the predicted mechanism (cardioembolic or not). To assess our classifier's validity, we correlated the predicted probability of a cardioembolic source with the eventual post-ESUS diagnosis of atrial fibrillation.

RESULTS: Among 1083 strokes with known etiologies, our classifier distinguished cardioembolic versus noncardioembolic cases with excellent accuracy (area under the curve, 0.85). Applied to 580 ESUS cases, the classifier predicted that 44% (95% credibility interval, 39%-49%) resulted from cardiac embolism. Individual ESUS patients' predicted likelihood of cardiac embolism was associated with eventual atrial fibrillation detection (OR per 10% increase, 1.27 [95% CI, 1.03-1.57]; c-statistic, 0.68 [95% CI, 0.58-0.78]). ESUS patients with high predicted probability of cardiac embolism were older and had more coronary and peripheral vascular disease, lower ejection fractions, larger left atria, lower blood pressures, and higher creatinine levels.

CONCLUSIONS: A machine learning estimator that distinguished known cardioembolic versus noncardioembolic strokes indirectly estimated that 44% of ESUS cases were cardioembolic.
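
The quantities reported in the abstract (area under the curve or c-statistic, a predicted proportion, and an odds ratio per 10% increase) can be illustrated with a minimal Python sketch. It is not the authors' code. The function names and the toy numbers are hypothetical. Averaging per-patient predicted probabilities is only one simple way to obtain a predicted proportion and may differ from the estimator behind the reported credibility interval. Rescaling the odds ratio assumes a logistic model that is linear in the log odds.

```python
from itertools import product
from statistics import mean


def c_statistic(scores, labels):
    """AUC as the share of (case, non-case) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case of each class")
    concordant = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return concordant / (len(pos) * len(neg))


def predicted_proportion(probabilities):
    """Assumed estimator: mean predicted probability across unlabeled cases."""
    return mean(probabilities)


def odds_ratio_for_increase(or_per_10_points, points):
    """Rescale an odds ratio reported per 10 percentage points (log-linear assumption)."""
    return or_per_10_points ** (points / 10)


# Toy data, invented for illustration only (1 = cardioembolic, 0 = noncardioembolic).
known_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
known_labels = [1, 1, 1, 0, 0, 0]
esus_probabilities = [0.7, 0.2, 0.5, 0.4]

auc = c_statistic(known_scores, known_labels)
share = predicted_proportion(esus_probabilities)
or_20_points = odds_ratio_for_increase(1.27, 20)

print(f"c-statistic on toy labeled cases: {auc:.3f}")
print(f"predicted cardioembolic share of toy ESUS cases: {share:.2f}")
print(f"odds ratio for a 20-point increase: {or_20_points:.4f}")
```

In this sketch, the reported odds ratio of 1.27 per 10% increase corresponds to 1.27 squared, about 1.61, for a 20-point difference in predicted probability, under the stated log-linear assumption.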

authors

Kamel, Hooman
Navi, Babak Benjamin
Duong, Cam Minh Duc
Merkler, Alexander
Okin, Peter M
Devereux, Richard B
Weinsaft, Jonathan W
Kim, Jiwon
Cheung, Jim W.
Kim, Luke
Casadei, Barbara
Iadecola, Costantino
Sabuncu, Mert R
Gupta, Ajay
Díaz, Iván

publication date

August 12, 2020

published in

Stroke Journal

Research

keywords

Intracranial Embolism
Machine Learning
Stroke

Identity

PubMed Central ID

PMC8034802

Scopus Document Identifier

85089922812

Digital Object Identifier (DOI)

10.1161/STROKEAHA.120.029305

PubMed ID

32781943

Additional Document Info

has global citation frequency

41

volume

51

issue

9

VIVO Weill Cornell Medical College

Machine Learning Prediction of Stroke Mechanism in Embolic Strokes of Undetermined Source. Academic Article
