
American Heart Association



Final ID: 134

Use of Large Language Model to Allow Reliable Data Acquisition for International Pediatric Stroke Study

Abstract Body: Introduction- Pediatric stroke research is hindered by limited funding and the relative rarity of the disease. Data sharing in pediatric stroke relies on unreimbursed data entry by clinical investigators at children’s hospitals participating in the International Pediatric Stroke Study (IPSS). Large language models (LLMs) could reduce investigator workload through automated data entry. In prior research, investigators achieved 94% accuracy using a prompt engineering approach with Generative Pretrained Transformer 4 (GPT4) to complete IPSS subject outcome forms from clinical notes. However, GPT4 performed only moderately (~50% correct) on some of the data questions. In this study, we aimed to use a toolkit called “Instructor” to improve LLM performance in areas where the prior method achieved less than 90% accuracy.
Methods- This retrospective study used de-identified clinical notes from 50 patients who presented to the UTHealth Pediatric Stroke Clinic with ischemic stroke between January 2020 and July 2023. Each note was run through the offline, HIPAA-compliant LLM “GPT4o” to answer questions in the IPSS outcome form. We focused on areas of the form where the prior approach yielded less than 90% accuracy. We implemented “Instructor”, a Python library built on Pydantic, to enhance prompt engineering and ensure structured outputs. Accuracy was measured as percent agreement between LLM-generated and investigator-entered data. We used simple descriptive statistics to compare the accuracy (% correct) of the Instructor method, measured against clinical investigator-entered data, with previously reported results from the traditional prompt engineering method.
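The accuracy metric described in the Methods, percent agreement between LLM-generated and investigator-entered answers, can be illustrated with a minimal sketch. The function names, the normalization step, and the answer categories below are assumptions made for illustration and do not come from the study or from the IPSS forms.

```python
def normalize(answer: str) -> str:
    """Lower-case and trim an answer so trivial formatting differences do not count as disagreement."""
    return answer.strip().lower()


def percent_agreement(llm_answers: list[str], investigator_answers: list[str]) -> float:
    """Percent of paired answers on which the LLM and the investigator agree."""
    if len(llm_answers) != len(investigator_answers):
        raise ValueError("Both answer lists must cover the same patients.")
    if not llm_answers:
        raise ValueError("At least one answer pair is required.")
    matches = sum(
        normalize(llm) == normalize(human)
        for llm, human in zip(llm_answers, investigator_answers)
    )
    return 100.0 * matches / len(llm_answers)


# Hypothetical answers to one outcome-form question for four patients.
llm = ["mild", "Moderate", "none", "severe"]
investigator = ["mild", "moderate", "none", "mild"]
agreement = percent_agreement(llm, investigator)
print(f"{agreement:.1f}% agreement")
```

In this made-up example the two sources agree on three of four patients. A per-question figure of this kind is what the reported accuracy percentages summarize.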
Results- We analyzed the neurological deficit severity and post-discharge rehabilitation questions. The Instructor-based approach achieved 100% accuracy on both, compared with 46-54% and 26-62%, respectively, with the previous method.
Conclusion- In this study, Instructor showed promising results for reliable data retrieval. Moving forward, we will use Instructor to analyze neurological deficit type, follow-up imaging type, and imaging findings, and will expand this approach to other sections of the IPSS forms. In the future, LLMs may reduce investigator workload and increase the efficiency of observational research for rare, underserved diseases like pediatric stroke.
  • Bhayana, Kriti  ( UT Medical School at Houston , Houston , Texas , United States )
  • Wang, Dulin  ( UTHealth , Pearland , Texas , United States )
  • Jiang, Xiaoqian  ( UTHealth , Pearland , Texas , United States )
  • Fraser, Stuart  ( UT Medical School at Houston , Houston , Texas , United States )
  • Author Disclosures:
    Kriti Bhayana: DO NOT have relevant financial relationships | Dulin Wang: DO NOT have relevant financial relationships | Xiaoqian Jiang: DO NOT have relevant financial relationships | Stuart Fraser: DO NOT have relevant financial relationships
Meeting Info:
Session Info:

Pediatric Cerebrovascular Disease Oral Abstracts

Friday, 02/07/2025 , 07:30AM - 09:00AM

Oral Abstract Session

