American Heart Association

Final ID: Su3127

Performance of a Popular Large Language Model in Answering Cardiovascular Related Queries: A Systematic Review and Pooled-Analysis

Background: The integration of large language models (LLMs) such as ChatGPT into healthcare can have significant implications for patient education and clinical decision-making.

Aims: This systematic review and pooled analysis aims to evaluate the accuracy of ChatGPT-3.5 and ChatGPT-4 in answering simple queries across cardiovascular (CV) medicine disciplines.

Methods: Literature searches were conducted in PubMed, Embase, and Cochrane Central in May 2024. Keywords included “ChatGPT”, “LLMs”, and “Chat-based artificial intelligence models”. Cross-sectional, peer-reviewed studies published in 2023 and 2024 that investigated ChatGPT’s performance on CV medicine-related queries (Table/Figure) were included, and their data were extracted. Within each study, responses to all queries had been evaluated by expert physicians in the corresponding field; we did not readjudicate them. For the pooled analysis, the gradings were standardized to a binary scale, with each answer classified as "accurate" or "inaccurate".

Results: Of 127 peer-reviewed studies identified and screened, 14 studies involving 542 CV-related queries were included. Pooled analysis revealed an overall accuracy of 84.5% (458/542) (95% CI [81.5, 87.6]). Stratification by model (ChatGPT-4 vs. ChatGPT-3.5) did not show a significant difference in accuracy (p=0.32). Furthermore, no significant difference in accuracy was seen between answers in 2023 and 2024 (p=0.07). Accuracy was statistically comparable across topics, except in cardio-oncology, which showed significantly lower accuracy at 68% (p=0.02). Detailed performance per topic is included in the table and figure.
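The pooled accuracy and its confidence interval can be checked from the reported counts. The sketch below is illustrative only: the abstract does not state which interval method was used, so a normal-approximation (Wald) interval is assumed here, and the function name `wald_interval` is hypothetical. Under that assumption the result lands close to the reported interval; small differences in the last digit may reflect a different method or rounding.

```python
import math


def wald_interval(successes, total, z=1.959964):
    """Return (proportion, lower, upper) for a binomial proportion.

    Uses the normal-approximation (Wald) interval. This method is an
    assumption; the abstract does not say how its interval was computed.
    """
    p = successes / total
    se = math.sqrt(p * (1 - p) / total)  # standard error of the proportion
    return p, p - z * se, p + z * se


# Counts reported in the Results: 458 accurate answers out of 542 queries.
accuracy, lower, upper = wald_interval(458, 542)
print(f"accuracy = {accuracy:.1%}, 95% CI = [{lower:.1%}, {upper:.1%}]")
```

A Wilson or exact (Clopper-Pearson) interval would be a reasonable alternative and gives slightly different bounds at this sample size.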

Conclusion: ChatGPT demonstrated consistently high accuracy in answering CV-related queries, with no significant differences across model versions or years. These results support the potential use of online chat-based LLMs as an informational tool in cardiology.
  • Kassab, Joseph (Cleveland Clinic Foundation, Cleveland, Ohio, United States)
  • El Hajjar, Abdel Hadi (Cleveland Clinic, Cleveland, Ohio, United States)
  • Haroun, Elio (Cleveland Clinic, Cleveland, Ohio, United States)
  • Kanj, Mohamed (Cleveland Clinic, Cleveland, Ohio, United States)
  • Sarraju, Ashish (Cleveland Clinic, Cleveland, Ohio, United States)
  • Xlaffinx, Xlukex (Cleveland Clinic Foundation, Cleveland, Ohio, United States)
  • Kapadia, Samir (Cleveland Clinic, Orae, Ohio, United States)
  • Harb, Serge (Cleveland Clinic, Cleveland, Ohio, United States)
  • Author Disclosures:
    Joseph Kassab: DO NOT have relevant financial relationships | Abdel Hadi El Hajjar: DO NOT have relevant financial relationships | Elio Haroun: DO NOT have relevant financial relationships | Mohamed Kanj: DO NOT have relevant financial relationships | Ashish Sarraju: No Answer | xLukex xLaffinx: DO have relevant financial relationships ; Consultant:Medtronic:Active (exists now) ; Consultant:Crispr Therapeutics:Active (exists now) ; Research Funding (PI or named investigator):Arrowhead Pharmaceuticals:Active (exists now) ; Consultant:Idorsia:Active (exists now) ; Consultant:Veradermics:Active (exists now) ; Advisor:Gordy Health:Past (completed) ; Advisor:LucidAct Health:Past (completed) ; Research Funding (PI or named investigator):Mineralys:Active (exists now) ; Research Funding (PI or named investigator):Astrazeneca:Active (exists now) ; Consultant:Lilly:Active (exists now) ; Speaker:Recor:Active (exists now) | Samir Kapadia: DO NOT have relevant financial relationships | Serge Harb: No Answer
Meeting Info:

Scientific Sessions 2024

2024

Chicago, Illinois

Session Info:

LLMs Friend or Foe?

Sunday, 11/17/2024 , 03:15PM - 04:15PM

Abstract Poster Session
