American Heart Association

Final ID: MP1528

Hybrid Rule-Based and Large Language Model Framework Extracts Statin-Related Information from Clinical Notes

Abstract Body: Background: Atherosclerotic cardiovascular disease is a leading cause of morbidity and mortality; statin therapy reduces risk, but adherence is suboptimal. Clinical notes contain details on statin intolerance, contraindications, and patient deferral that structured data miss, yet manual extraction is time-consuming.
Hypothesis: A hybrid AI framework combining rule-based NLP and LLM-based methods can accurately extract statin-related information from clinical notes to inform clinical decision support.
Methods: We developed a three-component framework: (1) a rule-based NLP filter to exclude irrelevant notes, (2) an LLM-based refinement filter to identify notes likely containing relevant information, and (3) an LLM-based multicategory classifier to categorize records into intolerance, contraindications, and deferral. Dataset A (2,000 notes; July 1–August 1, 2024) from adult primary care visits at Vanderbilt University Medical Center (VUMC) was split into training (n = 1,200) and testing (n = 800) subsets for development and evaluation. Dataset B (197,761 notes; August 1–September 1, 2024) was used for retrospective evaluation. Performance metrics included precision, recall, F1, accuracy, and filter-out rate. Patient-level prevalence for each category was measured in Dataset B.
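The three components described in Methods can be read as a sequential filter-then-classify pipeline. The following is a minimal sketch under stated assumptions, not the authors' implementation: the keyword rule `STATIN_RULE`, the function names, and the toy stand-ins for the two LLM calls are all hypothetical, since the abstract does not publish its rules, prompts, or models.

```python
import re
from typing import Callable, Iterable, List, Tuple

# Assumed keyword rule; the abstract does not publish the actual rule set.
STATIN_RULE = re.compile(
    r"\b(?:statins?|atorvastatin|rosuvastatin|simvastatin|pravastatin)\b",
    re.IGNORECASE,
)
# Category names follow the abstract; the label strings are illustrative.
CATEGORIES = ("intolerance", "contraindication", "deferral")


def rule_based_filter(note: str) -> bool:
    """Component 1: keep a note only if it mentions a statin."""
    return STATIN_RULE.search(note) is not None


def run_pipeline(
    notes: Iterable[str],
    is_relevant: Callable[[str], bool],
    classify: Callable[[str], List[str]],
) -> List[Tuple[str, List[str]]]:
    """Apply the three components in order and return (note, labels) pairs."""
    kept = []
    for note in notes:
        if not rule_based_filter(note):   # component 1: rule-based NLP filter
            continue
        if not is_relevant(note):         # component 2: LLM refinement filter
            continue
        # Component 3: multicategory classifier; drop unknown labels.
        labels = [label for label in classify(note) if label in CATEGORIES]
        kept.append((note, labels))
    return kept


# Toy stand-ins for the two LLM calls (hypothetical, for illustration only).
def toy_is_relevant(note: str) -> bool:
    text = note.lower()
    return "myalgia" in text or "prefers to wait" in text


def toy_classify(note: str) -> List[str]:
    text = note.lower()
    labels = []
    if "myalgia" in text:
        labels.append("intolerance")
    if "prefers to wait" in text:
        labels.append("deferral")
    return labels


notes = [
    "Patient reports myalgia on atorvastatin; medication stopped.",
    "Follow-up for ankle sprain.",
    "Discussed statins; patient prefers to wait.",
    "Continue simvastatin 20 mg daily.",
]
results = run_pipeline(notes, toy_is_relevant, toy_classify)
for note, labels in results:
    print(labels, "|", note)
```

Passing the two LLM steps in as callables keeps the cheap rule-based filter first, so the more expensive model calls run only on notes that survive it, which is the ordering the Methods describe.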
Results: In Dataset A, the rule-based NLP filter excluded 81% of notes while retaining all relevant ones (precision = 1.00). The LLM-based refinement filter achieved precision = 0.973, recall = 0.947, F1 = 0.960, accuracy = 0.996, and a filter-out rate of 95.4% on the testing subset. The multicategory classifier attained F1 scores of 0.99 (intolerance), 0.81 (contraindications), and 0.86 (deferral). In Dataset B, after sequential filtering, 45,253 of 197,761 notes remained; the classifier identified 3,027 patients (6.4%) with documented intolerance, 310 (0.7%) with contraindications, and 1,391 (2.9%) who deferred therapy.
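For readers who want to relate the reported metrics to one another, the sketch below computes them from confusion-matrix counts. The abstract does not report such counts; the values used here (36 true positives, 1 false positive, 2 false negatives, 761 true negatives) are illustrative numbers chosen only because they reproduce the reported refinement-filter figures on an 800-note test set, and they are not study data. Defining the filter-out rate as the share of notes predicted irrelevant is also an assumption.

```python
def precision(tp: int, fp: int) -> float:
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    return tp / (tp + fn)


def f1_score(tp: int, fp: int, fn: int) -> float:
    return 2 * tp / (2 * tp + fp + fn)


def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    return (tp + tn) / (tp + fp + fn + tn)


def filter_out_rate(tp: int, fp: int, fn: int, tn: int) -> float:
    # Assumed definition: fraction of notes the filter marks as irrelevant.
    return (tn + fn) / (tp + fp + fn + tn)


# Illustrative counts (NOT from the abstract), summing to the 800 test notes.
tp, fp, fn, tn = 36, 1, 2, 761

print(f"precision       = {precision(tp, fp):.3f}")
print(f"recall          = {recall(tp, fn):.3f}")
print(f"F1              = {f1_score(tp, fp, fn):.3f}")
print(f"accuracy        = {accuracy(tp, fp, fn, tn):.3f}")
print(f"filter-out rate = {filter_out_rate(tp, fp, fn, tn):.1%}")

# Dataset B: share of notes remaining after sequential filtering,
# using the two note counts reported in the abstract.
retained_b = 45_253 / 197_761
print(f"Dataset B retained = {retained_b:.1%}, filtered out = {1 - retained_b:.1%}")
```

With these counts the F1 score is also the harmonic mean of the reported precision and recall, which is a quick consistency check on the three numbers.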
Conclusions: The hybrid AI framework efficiently processes clinical notes, filtering out over 90% of irrelevant records while maintaining high precision for relevant content. This scalable approach enables extraction of actionable statin-related information and has potential to enhance clinical decision support by integrating patient-level insights to optimize statin therapy.
  • Liu, Siru (Vanderbilt University Medical Center, Nashville, Tennessee, United States)
  • McCoy, Allison (Vanderbilt University Medical Center, Nashville, Tennessee, United States)
  • Wright, Adam (Vanderbilt University Medical Center, Nashville, Tennessee, United States)
  • Author Disclosures:
    Siru Liu: No relevant financial relationships | Allison McCoy: No relevant financial relationships | Adam Wright: No answer
Meeting Info:

Scientific Sessions 2025

New Orleans, Louisiana

Session Info:

Transforming Healthcare with Large Language Models and NLP: From Unstructured Data to Clinical Insight

Sunday, 11/09/2025, 11:50 AM - 1:00 PM

Moderated Digital Poster Session
