Performance of a Popular Large Language Model in Answering Cardiovascular Related Queries: A Systematic Review and Pooled-Analysis
Background: The integration of large language models (LLMs) such as ChatGPT into healthcare can have significant implications for patient education and clinical decision-making.
Aims: This systematic review and pooled analysis aims to evaluate the accuracy of ChatGPT-3.5 and ChatGPT-4 in answering simple queries across cardiovascular (CV) medicine disciplines.
Methods: Literature searches were conducted in PubMed, Embase, and Cochrane Central in May 2024. Keywords included “ChatGPT”, “LLMs”, and “Chat-based artificial intelligence models”. Cross-sectional, peer-reviewed studies published in 2023 and 2024 that investigated ChatGPT’s performance on CV medicine-related queries were included, and their data were extracted (Table/Figure). Within each study, responses to all queries had been evaluated by expert physicians in the corresponding field; we did not readjudicate them. For the pooled analysis, each answer was assigned a standardized binary grade, “accurate” or “inaccurate”.
Results: Of 127 peer-reviewed studies identified and screened, 14 studies involving 542 CV-related queries were included. Pooled analysis showed an overall accuracy of 84.5% (458/542; 95% CI, 81.5% to 87.6%). Accuracy did not differ significantly between models (ChatGPT-4 vs. ChatGPT-3.5; p=0.32) or between answers from 2023 and 2024 (p=0.07). Accuracy was statistically comparable across topics, except for cardio-oncology, where it was significantly lower at 68% (p=0.02). Detailed performance per topic is shown in the table and figure.
Conclusion: ChatGPT demonstrated consistently high accuracy in answering CV-related queries, with no significant differences across model versions or years. These results support the potential use of chat-based LLMs as an informational tool in cardiology.
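The pooled accuracy and its confidence interval reported in the Results can be checked with a few lines of arithmetic. The sketch below is illustrative only: the abstract does not state which interval method was used, so a normal-approximation (Wald) interval is assumed here, and the function name `wald_interval` is hypothetical. Under that assumption the result agrees with the reported interval to within rounding of the last digit.

```python
import math


def wald_interval(successes: int, total: int, z: float = 1.96):
    """Return (proportion, lower, upper) using the normal approximation.

    Assumption: a Wald interval with z = 1.96 for a 95% CI; the abstract
    does not say which method the authors actually used.
    """
    p = successes / total
    standard_error = math.sqrt(p * (1 - p) / total)
    return p, p - z * standard_error, p + z * standard_error


# Counts taken from the Results: 458 accurate answers out of 542 queries.
accuracy, lower, upper = wald_interval(458, 542)
print(f"Pooled accuracy: {accuracy:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```

Other interval methods (for example Wilson or exact binomial) would give slightly different bounds, which may explain small last-digit discrepancies.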
Kassab, Joseph (Cleveland Clinic Foundation, Cleveland, Ohio, United States)
El Hajjar, Abdel Hadi (Cleveland Clinic, Cleveland, Ohio, United States)
Haroun, Elio (Cleveland Clinic, Cleveland, Ohio, United States)
Kanj, Mohamed (Cleveland Clinic, Cleveland, Ohio, United States)
Sarraju, Ashish (Cleveland Clinic, Cleveland, Ohio, United States)
Xlaffinx, Xlukex (Cleveland Clinic Foundation, Cleveland, Ohio, United States)
Kapadia, Samir (Cleveland Clinic, Orae, Ohio, United States)
Harb, Serge (Cleveland Clinic, Cleveland, Ohio, United States)
Author Disclosures:
Joseph Kassab: DO NOT have relevant financial relationships
Abdel Hadi El Hajjar: DO NOT have relevant financial relationships
Elio Haroun: DO NOT have relevant financial relationships
Mohamed Kanj: DO NOT have relevant financial relationships
Ashish Sarraju: No Answer
xLukex xLaffinx: DO have relevant financial relationships; Consultant: Medtronic: Active (exists now); Consultant: CRISPR Therapeutics: Active (exists now); Research Funding (PI or named investigator): Arrowhead Pharmaceuticals: Active (exists now); Consultant: Idorsia: Active (exists now); Consultant: Veradermics: Active (exists now); Advisor: Gordy Health: Past (completed); Advisor: LucidAct Health: Past (completed); Research Funding (PI or named investigator): Mineralys: Active (exists now); Research Funding (PI or named investigator): AstraZeneca: Active (exists now); Consultant: Lilly: Active (exists now); Speaker: Recor: Active (exists now)
Samir Kapadia: DO NOT have relevant financial relationships
Serge Harb: No Answer