American Heart Association

Final ID: MP465

Performance Benchmarking of Smaller Language Models Against GPT-4 for Predicting Reasons for Oral Anticoagulation Nonprescription in Atrial Fibrillation

Background:
Oral anticoagulation (OAC) reduces stroke risk in atrial fibrillation (AF), yet nonprescription rates approach 50% with poorly characterized reasons. Proprietary large language models (LLMs) like GPT-4 can identify documented reasons for OAC nonprescription from clinical notes but present cost and privacy barriers to widespread deployment. We investigate whether smaller, open-source LLMs (Gemma-2-9B-IT, Phi-4K) can achieve comparable performance.
Hypothesis:
Open-source LLMs can match the performance of GPT-4 when augmented with prompting techniques such as chain-of-thought (CoT).
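To make the distinction between zero-shot and CoT prompting concrete, here is a minimal sketch of how the two prompt styles might differ. The abstract does not give the actual prompt wording or label set, so the `CATEGORIES` list (assembled from the reasons named in the Results) and the `build_prompt` helper are hypothetical illustrations, not the study's prompts.

```python
# Hypothetical label set, assembled from the reasons named in the Results.
# The study's actual labels and prompt text are not given in the abstract.
CATEGORIES = [
    "patient preference",
    "antiplatelet use",
    "perceived contraindication",
    "low stroke risk",
    "low AF burden",
    "already on OAC",
    "history of AF ablation",
]


def build_prompt(excerpt: str, use_cot: bool = False) -> str:
    """Build a zero-shot prompt; optionally add a chain-of-thought instruction."""
    lines = [
        "You are reviewing a clinical note excerpt about atrial fibrillation.",
        "Identify any documented reason for not prescribing oral anticoagulation.",
        "Choose from: " + "; ".join(CATEGORIES) + ".",
        "",
        "Excerpt:",
        excerpt.strip(),
        "",
    ]
    if use_cot:
        # CoT variant: ask for intermediate reasoning before the final label.
        lines.append(
            "Think step by step, quoting the relevant phrases, "
            "then give a final line starting with 'Answer:'."
        )
    else:
        # Zero-shot variant: ask for the label directly.
        lines.append("Respond with a single line starting with 'Answer:'.")
    return "\n".join(lines)
```

The only difference between the two variants in this sketch is the final instruction, which keeps any comparison between them focused on the reasoning step rather than on differences in task description.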
Methods:
We identified all patient encounters with clinician-billed ICD-10 AF diagnosis codes at Stanford Health Care from January 1, 2015 through December 31, 2023. Three reviewers annotated 10% of AF-related note excerpts to identify OAC nonprescription reasons. We developed zero-shot prompts for GPT-4, Gemma-2-9B-IT, and Phi-4K, plus CoT prompts for the open-source models (Graphic 1). Performance was assessed using weighted macro-F1 scores.
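The abstract does not spell out how the "weighted macro-F1" was computed, so the following is only a sketch of the two standard averages it could refer to: an unweighted macro average of per-class F1 and a support-weighted average. It assumes single-label predictions and a non-empty gold list; the function names and the toy labels are illustrative and not from the study.

```python
from collections import Counter


def per_class_f1(gold, pred):
    """Per-class F1 for single-label predictions (assumed setup)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[g] += 1  # gold class gains a false negative
    scores = {}
    for label in sorted(set(gold) | set(pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(gold, pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(gold, pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(gold, pred):
    """Mean of per-class F1 weighted by each class's count in the gold labels."""
    scores = per_class_f1(gold, pred)
    support = Counter(gold)
    return sum(scores[label] * support[label] for label in scores) / len(gold)


# Toy example with made-up labels (not study data).
gold = ["antiplatelet", "antiplatelet", "low_burden", "preference"]
pred = ["antiplatelet", "low_burden", "low_burden", "preference"]
print(round(macro_f1(gold, pred), 3), round(weighted_f1(gold, pred), 3))
```

The macro average treats rare reasons as equally important as common ones, while the weighted average lets frequent categories dominate; which of the two the reported scores reflect is not stated in the abstract.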
Results:
Of 35,737 AF encounters, 7,712 (21.6%) lacked active OAC prescriptions. From 9,143 associated notes, we extracted 21,573 AF/OAC-related excerpts, with 10% (911 notes, 2,175 excerpts) manually annotated. Reasons for nonprescription appeared in 497 (54.6%) notes, most commonly antiplatelet use (18.6%), perceived contraindication (14.7%), and low AF burden (13.9%). Gemma-2-9B-IT with CoT achieved the highest average macro-F1 score (0.81), versus GPT-4 (0.80), Gemma-2-9B-IT (0.76), Phi-4-14B (0.71), and Phi-4-14B with CoT (0.68). Gemma-2-9B-IT with CoT outperformed others in four categories (perceived contraindication, low stroke risk, low AF burden, already on OAC), while GPT-4 performed best for patient preference and antiplatelet alternatives, and Gemma-2-9B-IT for history of AF ablation (Graphic 2).
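As a quick arithmetic check, the proportions reported above can be recomputed from the stated counts. This sketch uses only numbers from the Results; the `pct` helper is an illustrative name.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Encounters lacking an active OAC prescription: 7,712 of 35,737.
encounters_without_oac = pct(7712, 35737)

# Annotated notes containing a documented reason: 497 of 911.
notes_with_reason = pct(497, 911)

# Annotated share of notes and excerpts (reported as 10%).
annotated_notes_share = pct(911, 9143)
annotated_excerpts_share = pct(2175, 21573)

print(encounters_without_oac, notes_with_reason)
print(annotated_notes_share, annotated_excerpts_share)
```

The first two values match the reported 21.6% and 54.6%, and the annotated shares both come out at roughly 10%, consistent with the text.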
Conclusions:
Gemma-2-9B-IT, an open-source LLM, categorized OAC nonprescription reasons with performance comparable to GPT-4. This suggests that much smaller, freely available, and privacy-preserving LLMs can identify barriers to guideline-directed AF care and could be deployed across health systems to help reduce care gaps in OAC prescribing.
  • Somani, Sulaiman (Stanford Health Care, Menlo Park, California, United States)
  • Kim, Dale (Stanford University, Highlands Ranch, Colorado, United States)
  • Perez Guerrero, Eduardo (Stanford University, Stanford, California, United States)
  • Ngo, Summer (Stanford University, Highlands Ranch, Colorado, United States)
  • Nguyen, Minh (Stanford University, Highlands Ranch, Colorado, United States)
  • Sandhu, Alexander (Stanford University, Millbrae, California, United States)
  • Alsentzer, Emily (Stanford University, Highlands Ranch, Colorado, United States)
  • Hernandez-Boussard, Tina (Stanford University, Highlands Ranch, Colorado, United States)
  • Rodriguez, Fatima (Stanford University, Palo Alto, California, United States)
  • Author Disclosures:
    Sulaiman Somani: DO NOT have relevant financial relationships
    Dale Kim: DO NOT have relevant financial relationships
    Eduardo Perez Guerrero: DO NOT have relevant financial relationships
    Summer Ngo: DO NOT have relevant financial relationships
    Minh Nguyen: DO NOT have relevant financial relationships
    Alexander Sandhu: DO have relevant financial relationships; Consultant: Reprieve Cardiovascular: Active (exists now); Consultant: Clearly: Active (exists now); Research Funding (PI or named investigator): NOVO NORDISK: Active (exists now); Research Funding (PI or named investigator): Novartis: Active (exists now); Research Funding (PI or named investigator): Bayer: Active (exists now); Research Funding (PI or named investigator): Astra Zeneca: Active (exists now)
    Emily Alsentzer: DO have relevant financial relationships; Consultant: Fourier Health: Active (exists now)
    Tina Hernandez-Boussard: No Answer
    Fatima Rodriguez: DO have relevant financial relationships; Consultant: HealthPals: Past (completed); Consultant: Cleerly Health: Active (exists now); Consultant: Amgen: Active (exists now); Consultant: iRhythm: Active (exists now); Consultant: HeartFlow: Active (exists now); Consultant: Arrowhead Pharmaceuticals: Active (exists now); Consultant: Edwards: Active (exists now); Consultant: Inclusive Health: Active (exists now); Consultant: Esperion Therapeutics: Past (completed); Consultant: Kento Health: Active (exists now); Consultant: Movano Health: Active (exists now); Consultant: NovoNordisk: Past (completed); Consultant: Novartis: Active (exists now)
Meeting Info:

Scientific Sessions 2025

2025

New Orleans, Louisiana

Session Info:

Arrhythmias Unplugged: Equity, Innovation, and Risk in the Real World

Saturday, 11/08/2025 , 09:15AM - 10:30AM

Moderated Digital Poster Session
