American Heart Association


Final ID: MP1518

The enhanced potential of electrocardiogram interpretation empowered by a vision–language foundation model

Background
Electrocardiography (ECG) is a cornerstone of cardiovascular disease (CVD) diagnosis, but it is limited by poor spatial resolution and susceptibility to artifacts. Recent vision-language foundation models such as CLIP, which align data from different modalities, offer a way to enhance ECG interpretation. This study introduces ECGCLIP, a novel model that integrates ECG waveforms with clinical annotations to diagnose a broad spectrum of CVD.
Methods
ECGCLIP was trained on 5 million paired ECG images and reports from multicenter datasets annotated by experienced physicians. Using a self-supervised contrastive learning framework, the model aligned ECG signals with their textual interpretations. Performance was evaluated on 45 ECG tasks (e.g., arrhythmias, conduction disorders) and 29 echocardiography tasks (e.g., valvular diseases, heart failure) in internal and external validation cohorts by comparing the area under the precision-recall curve (PRAUC) under varying training-data regimes.
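The contrastive alignment described above can be illustrated with a CLIP-style symmetric loss. This is a minimal sketch only: the abstract does not specify ECGCLIP's actual loss, encoders, or hyperparameters, so the function names, the toy embeddings, and the temperature of 0.07 are all assumptions made for illustration.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]

def clip_contrastive_loss(ecg_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    ecg_embs[i] and text_embs[i] are assumed to come from the same
    ECG/report pair; every other combination in the batch is a negative.
    The temperature is an assumed value, not one taken from the abstract.
    """
    ecg = [l2_normalize(v) for v in ecg_embs]
    txt = [l2_normalize(v) for v in text_embs]
    n = len(ecg)
    # logits[i][j]: scaled cosine similarity between ECG i and report j
    logits = [[sum(a * b for a, b in zip(e, t)) / temperature for t in txt]
              for e in ecg]
    # ECG -> report direction: the matching report sits on the diagonal
    loss_e2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Report -> ECG direction: same computation on the transposed matrix
    logits_t = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_t2e = -sum(log_softmax(logits_t[j])[j] for j in range(n)) / n
    return 0.5 * (loss_e2t + loss_t2e)

# Toy batch: correctly paired embeddings should give a much lower loss
# than the same embeddings with the reports swapped.
aligned = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                [[1.0, 0.0], [0.0, 1.0]])
swapped = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

Minimizing a loss of this form pulls each ECG embedding toward its own report and away from the other reports in the batch, which is what allows the pairing itself to act as supervision without separate per-task labels.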
Results
ECGCLIP achieved PRAUC improvements of up to 0.5873 (e.g., AAI pacing: 0.0511 → 0.6384) and 0.5253 for rare conditions (e.g., Wolff-Parkinson-White syndrome at 1% data). Critical conditions such as ST-elevation myocardial infarction (STEMI) showed gains of 0.2170 (0.1156 → 0.3326), addressing traditional ECG limitations in ischemia detection. The model also enhanced detection of valvular diseases (e.g., mitral stenosis: Δ+0.211) and heart failure (LVEF < 40%: Δ+0.139), with 79% generalizability retention for tricuspid regurgitation in external validation. With only 1% of the training data, ECGCLIP matched or exceeded full-data baselines (e.g., sinus rhythm PRAUC: 0.9747 vs. 0.9877). Gains in low-incidence diseases (e.g., hyperkalemia: Δ+0.0169) highlighted efficacy in sparse-data scenarios. External validation showed an overall PRAUC improvement of 0.1565, with consistent gains in ventricular pre-excitation (Δ+0.5253) but smaller gains in structural anomalies (e.g., mitral stenosis: external Δ+0.100 vs. internal Δ+0.211), reflecting ECG's dependence on functional sequelae.
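For readers less familiar with the metric, the following sketch shows one common way to estimate PRAUC (average precision) and how a reported Δ is simply the difference between two PRAUC values. The abstract does not say which PRAUC estimator was used, so the choice of average precision, the function names, and the toy labels and scores are assumptions for illustration; only the two PRAUC pairs in the final lines are taken from the reported results.

```python
def average_precision(labels, scores):
    """Estimate PRAUC as average precision.

    labels: 1 for condition present, 0 for absent.
    scores: model confidence for the condition (higher = more likely).
    Ties between scores are not handled specially in this sketch.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("average precision is undefined without positives")
    # Rank cases from most to least confident
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    ap = 0.0
    for rank, i in enumerate(order, start=1):
        if labels[i] == 1:
            hits += 1
            ap += hits / rank  # precision at this recall step
    return ap / total_pos

def prauc_delta(baseline, model):
    """Absolute PRAUC gain of the model over the baseline, to 4 decimals."""
    return round(model - baseline, 4)

# Toy example (made-up labels and scores, not study data)
toy_ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
print(toy_ap)

# Deltas recomputed from the PRAUC pairs reported above
print(prauc_delta(0.0511, 0.6384))  # AAI pacing
print(prauc_delta(0.1156, 0.3326))  # STEMI
```

Because PRAUC depends on how often a condition occurs in the test set, absolute gains for rare conditions are best read alongside their baselines, as in the reported AAI pacing and STEMI pairs.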
Conclusion
ECGCLIP establishes a new paradigm for ECG interpretation by leveraging vision-language foundation models. It demonstrates exceptional data efficiency, enabling accurate diagnosis of diverse cardiac conditions, including rare and critical diseases, with minimal labeled data. The model's robustness across diseases supports its potential for deployment in resource-limited settings, enhancing accessibility and precision in CVD care.
  • Yu, Ziqing (Zhongshan Hospital, Fudan University, Shanghai, China)
  • Liang, Yixiu (Zhongshan Hospital, Fudan University, Shanghai, China)
  • Su, Yangang (Zhongshan Hospital, Fudan University, Shanghai, China)
  • Ge, Junbo (Zhongshan Hospital, Fudan University, Shanghai, China)
Author Disclosures:
    Ziqing Yu: No relevant financial relationships | Yixiu Liang: No Answer | Yangang Su: No Answer | Junbo Ge: No Answer
Meeting Info:

Scientific Sessions 2025

2025

New Orleans, Louisiana

Session Info:

Integrating AI with ECG and Physiologic Signals for Multimodal Precision Health

Sunday, 11/09/2025, 09:15 AM - 10:30 AM

Moderated Digital Poster Session
