Logo

American Heart Association

  10
  0


Final ID: LBP65

Using a Large-Language Model to Predict 3-month Functional Outcomes After Intracerebral Hemorrhage

Abstract Body: Introduction: Large language models (LLMs) have shown early promise for predictive clinical applications, but their ability to predict functional outcomes after stroke has been inadequately explored. We evaluated an LLM's performance in predicting 3-month modified Rankin Scale (mRS) scores in intracerebral hemorrhage (ICH) patients.

Methods: We performed a retrospective cohort study using a prospective institutional database of primary ICH patients over a 2-year period. We included all patients with complete data on demographics, ICH characteristics, and 3-month outcomes. We instructed a LLM (OpenAI GPT-4o mini) to predict each patient’s mRS score at 3 months post-discharge using a multi-shot approach and a model temperature set to 0.7. We developed a standardized LLM prompt that incorporated data from the first 24 hours of each patient's admission, including demographics, comorbidities, ICH characteristics, intubation status, and clinical scales (NIH Stroke Scale and Glasgow Coma Scale). The prompt also incorporated representative real-world examples corresponding to each possible mRS score. We evaluated the model’s performance by calculating correlation coefficients and mean differences between predicted and observed mRS score, and by assessing the model’s classification accuracy in predicting favorable (0-3) 3-month mRS.

Results: We identified 419 patients (51% female, mean age 74.0±13.4 years, median NIHSS 9 [IQR 3-19]). Model predictions were moderately correlated with observed outcomes (r=0.64), with a -0.99 (1.68) point mean difference in predicted versus observed mRS scores (with negative shifts indicating lower predicted scores). These differences were especially prominent in patients with observed mRS scores of 3, 5, and 6 (mean [SD] shifts -1.17 [1.46], -1.36 [1.40], and -1.61 [1.74], respectively). The model had 88% sensitivity and 67% specificity in predicting favorable outcomes, with positive and negative predictive values of 0.68 and 0.88, respectively.

Conclusions: While demonstrating technical feasibility in predicting ICH outcomes using clinical data, a widely-available LLM consistently underestimated 3-month mortality and functional disability, with the most substantive prediction errors in patients who died or had moderate or severe impairments. Future studies should explore other LLMs and model parameters and compare LLM performance with clinician predictions.
  • Mintz, Noa  ( Brown University , New York , New York , United States )
  • Erekat, Asala  ( Icahn School of Medicine at Mount Sinai , Union City , New Jersey , United States )
  • Mahta, Ali  ( Brown University , Providence , Rhode Island , United States )
  • Reznik, Michael  ( University of Pittsburgh , Pittsburgh , Pennsylvania , United States )
  • Kummer, Benjamin  ( MOUNT SINAI HEALTH SYSTEM , New York , New York , United States )
  • Author Disclosures:
    Noa Mintz: DO NOT have relevant financial relationships | Asala Erekat: DO NOT have relevant financial relationships | Ali Mahta: DO NOT have relevant financial relationships | Michael Reznik: DO NOT have relevant financial relationships | Benjamin Kummer: DO NOT have relevant financial relationships
Meeting Info:
Session Info:

Late-Breaking Science Posters

Wednesday, 02/05/2025 , 07:00PM - 07:30PM

Poster Abstract Session

More abstracts on this topic:
Cardiovascular Risk Models Using Large-Scale Physical Examination Data

Fei Jintao, Yao Yang, Zhang Ping, Xu Zhaoji, Liu Mingxue, Wu Xinyue, Wang Ding, Ding Shuhan, Li Minghui, Yuan Yifang, Luo Jiangying, Wang Guoxin, Zhao Jun, Cao Xuyang, Gao Minyu

Cardiometabolic Syndrome and Incident Alzheimer’s Disease: The Predicative Value of Age and CMS Using Cox and Machine Learning Models

Liu Longjian

More abstracts from these authors:
You have to be authorized to contact abstract author. Please, Login
Not Available

Readers' Comments

We encourage you to enter the discussion by posting your comments and questions below.

Presenters will be notified of your post so that they can respond as appropriate.

This discussion platform is provided to foster engagement, and simulate conversation and knowledge sharing.

 

You have to be authorized to post a comment. Please, Login or Signup.


   Rate this abstract  (Maximum characters: 500)