Machine Learning Analysis of EMS Data Leads to Accurate and Timely Stroke and Severe Stroke Detection
Background: Stroke is a leading cause of long-term disability and death, and time to treatment strongly affects outcomes. Rapid detection and transport to appropriate stroke centers are essential, particularly for severe stroke subtypes such as large vessel occlusion or subarachnoid hemorrhage. Emergency Medical Services (EMS) are the first point of contact for many stroke patients, making prehospital triage an important determinant of timely care. Current screening tools such as the Cincinnati Prehospital Stroke Scale (CPSS) are inconsistently applied, lack sensitivity, and do not account for stroke severity. Machine learning (ML)-based clinical decision support tools could enable earlier and more accurate stroke detection in real time.
Hypothesis: We hypothesized that ML analysis of prehospital EMS data could recognize stroke and its severe subtypes in a timely manner and more accurately than existing screening tools.
Methods: We conducted a retrospective analysis of 8,796 EMS encounters from 4,754 unique patients transported to a university-affiliated emergency department between 2015 and 2020. Stroke (n=192; 2.2%) and severe stroke (n=131; 1.5%) outcomes were determined using ICD-10 and CPT codes. Model inputs included demographics, vital signs, and dispatch characteristics. Three ML models were trained for binary classification of stroke and severe stroke: random forest (RF), XGBoost (XGB), and a sequential neural network (SNN). Performance was assessed using ROC-AUC and PR-AUC with 5,000 bootstrap resamples. Sensitivity and specificity were evaluated across thresholds. SHAP values were used to interpret model predictions and identify influential features.
Results: RF performed best for stroke (ROC-AUC: 0.827 [95% CI: 0.771-0.881]; PR-AUC: 0.230), while XGB performed best for severe stroke (ROC-AUC: 0.871 [95% CI: 0.803-0.929]; PR-AUC: 0.237), as shown in Figures 1 and 2. Literature-reported benchmarks for CPSS (81.1% sensitivity, 51.7% specificity) and the Vision, Aphasia, Neglect (VAN) screen (81% sensitivity, 38% specificity) were overlaid for visual comparison. Relative to these benchmarks, the ML models showed improved performance and favorable sensitivity-specificity tradeoffs across thresholds. The most influential features included systolic and diastolic blood pressure, Glasgow Coma Scale, pulse, age, and dispatch codes.
Conclusions: Machine learning applied to structured EMS data improves prehospital detection of stroke and severe stroke. Future models incorporating glucose, ECG results, or free-text notes may improve precision and support real-time triage.
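As an illustration of the evaluation described in the Methods, the following is a minimal sketch of a bootstrap confidence interval for ROC-AUC and of sensitivity and specificity at a single threshold. The abstract does not state which bootstrap scheme was used, so a percentile bootstrap is assumed here. The function names (`roc_auc`, `bootstrap_auc_ci`, `sens_spec`) and the toy labels and scores are hypothetical and are not the study's code or data.

```python
import random


def roc_auc(labels, scores):
    """ROC-AUC via the Mann-Whitney rank statistic (ties get average ranks)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the group of tied scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def bootstrap_auc_ci(labels, scores, n_resamples=5000, alpha=0.05, seed=0):
    """Percentile-bootstrap CI for ROC-AUC (assumed scheme, see lead-in)."""
    rng = random.Random(seed)
    n = len(labels)
    aucs = []
    while len(aucs) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        y = [labels[i] for i in idx]
        # AUC is undefined without both classes, so redraw such resamples.
        if 0 < sum(y) < n:
            aucs.append(roc_auc(y, [scores[i] for i in idx]))
    aucs.sort()
    lower = aucs[int((alpha / 2) * n_resamples)]
    upper = aucs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


def sens_spec(labels, scores, threshold):
    """Sensitivity and specificity when score >= threshold is called positive."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Toy, made-up example (1 = stroke, 0 = no stroke); not study data.
toy_labels = [0, 0, 1, 1, 0, 1, 0, 0]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70, 0.05, 0.55]
point_estimate = roc_auc(toy_labels, toy_scores)
ci_low, ci_high = bootstrap_auc_ci(toy_labels, toy_scores, n_resamples=500)
sensitivity, specificity = sens_spec(toy_labels, toy_scores, threshold=0.5)
```

Sweeping `threshold` over the range of predicted scores traces out the sensitivity-specificity tradeoff, against which a fixed benchmark operating point (such as a literature-reported sensitivity and specificity pair) can be compared.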
Saban, Michael (Loyola University Chicago, Maywood, Illinois, United States)
Akbilgic, Oguz (Wake Forest School of Medicine, Lewisville, North Carolina, United States)
Hiura, Grant (Duke University Health System, Durham, North Carolina, United States)
De La Pena, Paula (Loyola University Medical Center, Maywood, Illinois, United States)
Heiferman, Daniel (Edward-Elmhurst Health, Naperville, Illinois, United States)
Cichon, Mark (Loyola University Medical Center, Maywood, Illinois, United States)
Tootooni, M. Samie (Loyola University Chicago, Maywood, Illinois, United States)
Author Disclosures:
Michael Saban: DO NOT have relevant financial relationships
Oguz Akbilgic: DO NOT have relevant financial relationships
Grant Hiura: DO NOT have relevant financial relationships
Paula de la Pena: DO NOT have relevant financial relationships
Daniel Heiferman: No Answer
Mark Cichon: No Answer
M. Samie Tootooni: DO NOT have relevant financial relationships