Predicting Post-Stroke Cognitive Impairment (PSCI) Using Multiple Machine Learning Approaches
Abstract Body: Background: Post-stroke cognitive impairment (PSCI), a decline in cognition that occurs after a stroke, affects up to 60% of stroke survivors. Early detection of those at high risk for PSCI is essential for timely intervention and personalized care. Electronic health records (EHRs) contain valuable data that machine learning can leverage to predict PSCI, potentially improving patient outcomes. This study develops and validates machine learning models to predict PSCI, with the aim of enabling earlier diagnosis and improving post-stroke care.
Methods: We extracted 7956 patients with stroke of any type (ischemic or hemorrhagic) treated between 2012 and 2021 from the Emory Healthcare system. We predicted PSCI from the ICD codes and prescribed medications available up to discharge from the index stroke. First, we developed models with traditional machine learning methods: Logistic Regression, Support Vector Machine, and Random Forest. Then, we developed hypergraph models to enhance prediction performance. Unlike traditional graphs, which capture only pairwise relationships between entities, hypergraphs can model higher-order relationships among multiple entities: a single hyperedge (a patient encounter) simultaneously connects all of the vertices (ICD codes and medications) recorded in that encounter. Finally, we compared all methods on four metrics and selected the best one for the PSCI prediction task: ACC (accuracy, the proportion of correct predictions), AUC (area under the ROC curve, measuring the model's ability to distinguish between classes), AUPR (area under the precision-recall curve, which summarizes the trade-off between precision and recall), and Macro-F1 (the unweighted mean of the per-class F1 scores, where each F1 score is the harmonic mean of precision and recall).
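As an illustration of the encounter-as-hyperedge idea, the following minimal sketch builds a vertex-by-hyperedge incidence matrix from a few toy encounters. The encounter IDs, the specific ICD codes and medications, and the `build_incidence` helper are hypothetical and chosen only for illustration; the abstract does not describe the study's actual data format or model implementation.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical toy data: each hyperedge is one encounter, and its members are
# the ICD codes and medications recorded during that encounter.
encounters: Dict[str, Set[str]] = {
    "enc_1": {"ICD:I63.9", "ICD:I10", "MED:aspirin"},
    "enc_2": {"ICD:I61.9", "MED:labetalol"},
    "enc_3": {"ICD:I63.9", "MED:aspirin", "MED:atorvastatin"},
}


def build_incidence(
    edges: Dict[str, Set[str]]
) -> Tuple[List[str], List[str], List[List[int]]]:
    """Return (vertices, edge_ids, H) where H[i][j] = 1 if vertex i is in hyperedge j."""
    vertices = sorted(set().union(*edges.values()))
    edge_ids = sorted(edges)
    incidence = [[1 if v in edges[e] else 0 for e in edge_ids] for v in vertices]
    return vertices, edge_ids, incidence


vertices, edge_ids, H = build_incidence(encounters)

# Hyperedge size: how many codes/medications one encounter connects at once.
edge_sizes = {e: sum(row[j] for row in H) for j, e in enumerate(edge_ids)}

# Vertex degree: how many encounters a given code or medication appears in.
vertex_degrees = {v: sum(H[i]) for i, v in enumerate(vertices)}

print(edge_sizes)
print(vertex_degrees)
```

A single column of `H` can contain more than two ones, which is what distinguishes a hyperedge from an ordinary graph edge that joins exactly two vertices.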
Results: The analysis included 7956 patients with stroke of any type (50% female, 56% non-white), of whom 1797 (23%) had diagnostic codes often used by clinicians at Emory to document PSCI. The hypergraph model achieved higher ACC, AUC, AUPR, and Macro-F1 than the other models.
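For readers who want to see how such metrics are computed, the sketch below implements accuracy, Macro-F1, and ROC AUC in plain Python on made-up labels and scores. It assumes binary labels (1 = PSCI, 0 = no PSCI), takes Macro-F1 as the unweighted mean of the per-class F1 scores, and computes AUC as the fraction of positive-negative pairs ranked correctly. The data, the 0.5 threshold, and the function names are illustrative assumptions and are not the study's evaluation code; AUPR is omitted for brevity.

```python
from typing import List, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Proportion of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_class(y_true: Sequence[int], y_pred: Sequence[int], cls: int) -> float:
    """F1 for one class: harmonic mean of that class's precision and recall."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and predicted probabilities, for illustration only.
y_true: List[int] = [1, 0, 0, 1, 0, 0, 1, 0]
scores: List[float] = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.8, 0.7]
y_pred: List[int] = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"ACC={acc:.3f}  Macro-F1={mf1:.3f}  AUC={auc:.3f}")
```

Because Macro-F1 weights both classes equally, it is less dominated by the majority class than accuracy, which matters when the positive class is a minority of the cohort.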
Conclusion: Across the machine learning methods compared, hypergraph models outperformed traditional machine learning methods in using EHR data to predict PSCI.
Xie, Yuzhang (Emory University, Atlanta, Georgia, United States)
Nahab, Fadi (Emory University, Atlanta, Georgia, United States)
Ge, Yi (University of California, Berkeley, Berkeley, California, United States)
Wu, Yuhua (Emory University, Atlanta, Georgia, United States)
Saurman, Jessica (Emory University, Atlanta, Georgia, United States)
Yang, Carl (Emory University, Atlanta, Georgia, United States)
Hu, Xiao (Emory University, Atlanta, Georgia, United States)
Author Disclosures:
Yuzhang Xie: DO NOT have relevant financial relationships
| Fadi Nahab: DO NOT have relevant financial relationships
| Yi Ge: DO NOT have relevant financial relationships
| Yuhua Wu: No Answer
| Jessica Saurman: DO NOT have relevant financial relationships
| Carl Yang: DO NOT have relevant financial relationships
| Xiao Hu: No Answer