
American Heart Association


Final ID: MP465

Performance Benchmarking of Smaller Language Models Against GPT-4 for Predicting Reasons for Oral Anticoagulation Nonprescription in Atrial Fibrillation

Background:
Oral anticoagulation (OAC) reduces stroke risk in atrial fibrillation (AF), yet nonprescription rates approach 50%, and the reasons are poorly characterized. Proprietary large language models (LLMs) such as GPT-4 can identify documented reasons for OAC nonprescription from clinical notes, but their cost and privacy constraints are barriers to widespread deployment. We investigate whether smaller, open-source LLMs (Gemma-2-9B-IT, Phi-4K) can achieve comparable performance.
Hypothesis:
Open-source LLMs can match the performance of GPT-4 when augmented with techniques such as chain-of-thought (CoT) prompting.
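To make the distinction between zero-shot and CoT prompting concrete, the following is a minimal sketch of how the two prompt styles might differ. The function name `build_prompt`, the prompt wording, and the `Answer:` convention are all hypothetical and are not the study's actual prompts; the category list is assembled from the reasons named in the Results and may not match the full label set the authors used.

```python
# Illustrative category labels, assembled from reasons named in the Results.
# The study's full label set is an assumption here.
CATEGORIES = [
    "antiplatelet use",
    "perceived contraindication",
    "low AF burden",
    "low stroke risk",
    "already on OAC",
    "patient preference",
    "history of AF ablation",
]


def build_prompt(excerpt: str, chain_of_thought: bool = False) -> str:
    """Build a zero-shot or CoT prompt for one note excerpt (hypothetical wording)."""
    lines = [
        "You are reviewing a clinical note excerpt about atrial fibrillation.",
        "Identify any documented reason that oral anticoagulation was not prescribed.",
        "Choose from: " + "; ".join(CATEGORIES) + "; or none.",
        "",
        "Excerpt: " + excerpt,
        "",
    ]
    if chain_of_thought:
        # CoT variant: ask the model to reason before committing to an answer.
        lines.append(
            "First explain your reasoning step by step, "
            "then give the final answer on a line starting with 'Answer:'."
        )
    else:
        # Zero-shot variant: ask for the answer directly, with no examples.
        lines.append("Respond with only a line starting with 'Answer:'.")
    return "\n".join(lines)
```

The only difference between the two variants in this sketch is the final instruction, which is the usual way a CoT prompt is derived from a zero-shot one.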
Methods:
We identified all patient encounters with clinician-billed ICD-10 AF diagnosis codes at Stanford Health Care from January 1, 2015 through December 31, 2023. Three reviewers annotated 10% of AF-related note excerpts to identify OAC nonprescription reasons. We developed zero-shot prompts for GPT-4, Gemma-2-9B-IT, and Phi-4K, plus CoT prompts for the open-source models (Graphic 1). Performance was assessed using weighted macro-F1 scores.
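The abstract does not spell out how the "weighted macro-F1" score is computed, so the sketch below shows both common readings: an unweighted macro average of per-class F1 and a support-weighted average. It assumes a single label per excerpt for simplicity (the actual task may be multi-label), and the function names and toy labels are illustrative only.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} using F1 = 2*TP / (2*TP + FP + FN)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong
            fn[truth] += 1  # true label was missed
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by each class's count in y_true."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)
```

With rare categories, the two averages can diverge noticeably: the macro average gives a rare reason the same influence as a common one, while the weighted average is dominated by frequent reasons.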
Results:
Of 35,737 AF encounters, 7,712 (21.6%) lacked active OAC prescriptions. From 9,143 associated notes, we extracted 21,573 AF/OAC-related excerpts, with 10% (911 notes, 2,175 excerpts) manually annotated. Reasons for nonprescription appeared in 497 (54.6%) notes, most commonly antiplatelet use (18.6%), perceived contraindication (14.7%), and low AF burden (13.9%). Gemma-2-9B-IT with CoT achieved the highest average macro-F1 score (0.81), versus GPT-4 (0.80), Gemma-2-9B-IT (0.76), Phi-4-14B (0.71), and Phi-4-14B with CoT (0.68). Gemma-2-9B-IT with CoT outperformed others in four categories (perceived contraindication, low stroke risk, low AF burden, already on OAC), while GPT-4 performed best for patient preference and antiplatelet alternatives, and Gemma-2-9B-IT for history of AF ablation (Graphic 2).
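As a quick consistency check, the proportions reported above can be recomputed from the stated counts. The helper `pct` is a hypothetical name introduced only for this sketch; all counts are taken directly from the Results.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


# Encounters lacking an active OAC prescription (reported as 21.6%).
no_oac_pct = pct(7_712, 35_737)

# Annotated notes containing a nonprescription reason (reported as 54.6%).
reason_pct = pct(497, 911)

# Annotated sample relative to the full corpus (reported as 10%).
notes_sampled_pct = pct(911, 9_143)
excerpts_sampled_pct = pct(2_175, 21_573)
```

The recomputed values agree with the reported 21.6% and 54.6%, and both sampling fractions are within about 0.1 percentage point of the stated 10%.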
Conclusions:
Gemma-2-9B-IT, an open-source LLM, categorized OAC nonprescription reasons with performance comparable to GPT-4. This demonstrates that much smaller, freely available, and privacy-preserving LLMs can identify barriers to guideline-directed AF care and be deployed across health systems to help reduce care gaps in OAC prescriptions.
  • Somani, Sulaiman (Stanford Health Care, Menlo Park, California, United States)
  • Kim, Dale (Stanford University, Highlands Ranch, Colorado, United States)
  • Perez Guerrero, Eduardo (Stanford University, Stanford, California, United States)
  • Ngo, Summer (Stanford University, Highlands Ranch, Colorado, United States)
  • Nguyen, Minh (Stanford University, Highlands Ranch, Colorado, United States)
  • Sandhu, Alexander (Stanford University, Millbrae, California, United States)
  • Alsentzer, Emily (Stanford University, Highlands Ranch, Colorado, United States)
  • Hernandez-Boussard, Tina (Stanford University, Highlands Ranch, Colorado, United States)
  • Rodriguez, Fatima (Stanford University, Palo Alto, California, United States)
Author Disclosures:
  • Sulaiman Somani: DO NOT have relevant financial relationships
  • Dale Kim: DO NOT have relevant financial relationships
  • Eduardo Perez Guerrero: DO NOT have relevant financial relationships
  • Summer Ngo: DO NOT have relevant financial relationships
  • Minh Nguyen: DO NOT have relevant financial relationships
  • Alexander Sandhu: DO have relevant financial relationships; Consultant: Reprieve Cardiovascular: Active (exists now); Consultant: Clearly: Active (exists now); Research Funding (PI or named investigator): NOVO NORDISK: Active (exists now); Research Funding (PI or named investigator): Novartis: Active (exists now); Research Funding (PI or named investigator): Bayer: Active (exists now); Research Funding (PI or named investigator): Astra Zeneca: Active (exists now)
  • Emily Alsentzer: DO have relevant financial relationships; Consultant: Fourier Health: Active (exists now)
  • Tina Hernandez-Boussard: No Answer
  • Fatima Rodriguez: DO have relevant financial relationships; Consultant: HealthPals: Past (completed); Consultant: Cleerly Health: Active (exists now); Consultant: Amgen: Active (exists now); Consultant: iRhythm: Active (exists now); Consultant: HeartFlow: Active (exists now); Consultant: Arrowhead Pharmaceuticals: Active (exists now); Consultant: Edwards: Active (exists now); Consultant: Inclusive Health: Active (exists now); Consultant: Esperion Therapeutics: Past (completed); Consultant: Kento Health: Active (exists now); Consultant: Movano Health: Active (exists now); Consultant: NovoNordisk: Past (completed); Consultant: Novartis: Active (exists now)
Meeting Info:

Scientific Sessions 2025

2025

New Orleans, Louisiana

Session Info:

Arrhythmias Unplugged: Equity, Innovation, and Risk in the Real World

Saturday, 11/08/2025 , 09:15AM - 10:30AM

Moderated Digital Poster Session

