American Heart Association

Final ID: Su3127

Performance of a Popular Large Language Model in Answering Cardiovascular Related Queries: A Systematic Review and Pooled-Analysis

Background: The integration of large language models (LLMs) such as ChatGPT into healthcare can have significant implications for patient education and clinical decision-making.

Aims: This systematic review and pooled analysis aims to evaluate the accuracy of ChatGPT-3.5 and ChatGPT-4 in answering simple queries across cardiovascular (CV) medicine disciplines.

Methods: Literature searches were conducted in PubMed, Embase, and Cochrane Central in May 2024. Keywords included “ChatGPT”, “LLMs”, and “Chat-based artificial intelligence models”. Cross-sectional, peer-reviewed studies published in 2023 and 2024 that investigated ChatGPT’s performance on CV medicine-related queries (Table/Figure) were included, and their data were extracted. All queries were evaluated by expert physicians in the corresponding fields within each study (and not by our readjudication). For the pooled analysis, each answer was classified on a standardized binary scale as “accurate” or “inaccurate”.

Results: Out of 127 identified and screened peer-reviewed studies, fourteen studies involving 542 CV-related queries were included. Pooled analysis revealed an overall accuracy of 84.5% (458/542) (95% CI [81.5, 87.6]). Stratification by model (ChatGPT-4 vs. ChatGPT-3.5) did not show a significant difference in accuracy (p=0.32). Furthermore, no significant differences in accuracies were seen between answers in 2023 and 2024 (p=0.07). The accuracies across the various topics were statistically comparable, except in the field of cardio-oncology, which showed significantly lower accuracy at 68% (p=0.02). Detailed performances per topic are included in the table and figure.
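The pooled accuracy and its confidence interval can be reproduced approximately from the two counts reported above. The sketch below assumes a simple normal-approximation (Wald) interval for a single proportion; the abstract does not state which interval method the authors used, and the function name `wald_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(successes, total, confidence=0.95):
    """Return (proportion, lower, upper) using the normal (Wald) approximation.

    Assumption: the abstract does not say how its interval was computed,
    so this is only one plausible method.
    """
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need 0 <= successes <= total and total > 0")
    p = successes / total
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * sqrt(p * (1 - p) / total)
    # Clamp to [0, 1] because the Wald interval can overshoot near the bounds.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Counts taken from the Results: 458 accurate answers out of 542 queries.
accuracy, lower, upper = wald_interval(458, 542)
print(f"{accuracy:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```

Under this assumption the interval comes out at roughly 81.5% to 87.5%, close to the reported one; a small difference in the last digit would be expected if a different method (for example Wilson or exact binomial) or different rounding was used.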

Conclusion: ChatGPT demonstrated consistently high accuracy in answering CV-related queries, with no significant differences across model versions or years. These results support the potential use of online chat-based LLMs as an informational tool in cardiology.
  • Kassab, Joseph (Cleveland Clinic Foundation, Cleveland, Ohio, United States)
  • El Hajjar, Abdel Hadi (Cleveland Clinic, Cleveland, Ohio, United States)
  • Haroun, Elio (Cleveland Clinic, Cleveland, Ohio, United States)
  • Kanj, Mohamed (Cleveland Clinic, Cleveland, Ohio, United States)
  • Sarraju, Ashish (Cleveland Clinic, Cleveland, Ohio, United States)
  • Xlaffinx, Xlukex (Cleveland Clinic Foundation, Cleveland, Ohio, United States)
  • Kapadia, Samir (Cleveland Clinic, Orae, Ohio, United States)
  • Harb, Serge (Cleveland Clinic, Cleveland, Ohio, United States)
  • Author Disclosures:
    Joseph Kassab: DO NOT have relevant financial relationships | Abdel Hadi El Hajjar: DO NOT have relevant financial relationships | Elio Haroun: DO NOT have relevant financial relationships | Mohamed Kanj: DO NOT have relevant financial relationships | Ashish Sarraju: No Answer | xLukex xLaffinx: DO have relevant financial relationships ; Consultant:Medtronic:Active (exists now) ; Consultant:Crispr Therapeutics:Active (exists now) ; Research Funding (PI or named investigator):Arrowhead Pharmaceuticals:Active (exists now) ; Consultant:Idorsia:Active (exists now) ; Consultant:Veradermics:Active (exists now) ; Advisor:Gordy Health:Past (completed) ; Advisor:LucidAct Health:Past (completed) ; Research Funding (PI or named investigator):Mineralys:Active (exists now) ; Research Funding (PI or named investigator):Astrazeneca:Active (exists now) ; Consultant:Lilly:Active (exists now) ; Speaker:Recor:Active (exists now) | Samir Kapadia: DO NOT have relevant financial relationships | Serge Harb: No Answer
Meeting Info:

Scientific Sessions 2024

2024

Chicago, Illinois

Session Info:

LLMs Friend or Foe?

Sunday, 11/17/2024 , 03:15PM - 04:15PM

Abstract Poster Session

More abstracts on this topic:

A blood test based on RNA-seq and machine learning for the detection of steatotic liver disease: A Pilot Study on Cardiometabolic Health

Poggio Rosana, Berdiñas Ignacio, La Greca Alejandro, Luzzani Carlos, Miriuka Santiago, Rodriguez-granillo Gaston, De Lillo Florencia, Rubilar Bibiana, Hijazi Razan, Solari Claudia, Rodríguez Varela María Soledad, Mobbs Alan, Manchini Estefania

A ChatGLM-based stroke diagnosis and prediction tool

Song Xiaowei, Wang Jiayi, Ma Weizhi, Wu Jian, Wang Yueming, Gao Ceshu, Wei Chenming, Pi Jingtao

More abstracts from these authors:
Troponin Testing Trends in US Emergency Departments

Kassab Joseph, Harb Serge, Kapadia Samir

Sodium-Glucose Cotransporter-2 Inhibitors in Moderate Aortic Stenosis

Kassab Joseph, Kapadia Samir, Harb Serge
