Leveraging Natural Language Processing for Echocardiographic Data Extraction in Hypoplastic Left Heart Syndrome AHA Conference Repository

American Heart Association


Final ID: Sa2046

Leveraging Natural Language Processing for Echocardiographic Data Extraction in Hypoplastic Left Heart Syndrome

Background:
Automated data mining performs well with discrete data, and efforts are underway to use natural language processing (NLP) for textual data collection. This method has yet to be proven generalizable to echocardiographic data in congenital heart disease.

Aim:
Assess the accuracy of an NLP algorithm relative to manual record abstraction (MRA) in collecting echocardiographic data in a cohort of patients with hypoplastic left heart syndrome (HLHS).

Methods:
Patients with HLHS treated at our institution between 2007 and 2022 were identified. MRA collected serial echocardiographic data at 5 predefined points between birth and stage II palliation, focused on right ventricle wall motion (RVWM) and tricuspid regurgitation (TR). Separately, an NLP algorithm was created to abstract the same data. We used our institution’s identified clinical research database, the Research Derivative (RD), which follows the Observational Medical Outcomes Partnership (OMOP) Common Data Model. Echocardiogram reports were identified by report type and matching titles. Regular expressions matched phrases for RVWM and TR together with qualifier phrases describing function. Qualifier phrases were mapped to severity scales: 0 (normal) to 6 (severely depressed) for RVWM; 0 (normal) to 5 (severe regurgitation) for TR. The results of the MRA and NLP collection methods were compared using Cohen’s kappa test.
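As an illustration only, the regular-expression step described above might look like the following Python sketch. It is not the authors' algorithm: the qualifier phrases, the intermediate scores, the patterns, and the function name `score_report` are all hypothetical. Only the scale endpoints (0 = normal; 6 = severely depressed for RVWM; 5 = severe regurgitation for TR) come from the abstract.

```python
import re

# Hypothetical qualifier-to-score maps. Only the endpoints are taken from the
# abstract; every intermediate phrase and score below is an assumption.
RVWM_SCORES = {
    "normal": 0,
    "mildly depressed": 2,
    "moderately depressed": 4,
    "severely depressed": 6,
}
TR_SCORES = {
    "no": 0,
    "trivial": 1,
    "mild": 2,
    "moderate": 3,
    "severe": 5,
}


def _qualifier_pattern(scores):
    # Try longer phrases first so a short phrase never shadows a longer one.
    phrases = sorted(scores, key=len, reverse=True)
    return "|".join(re.escape(p) for p in phrases)


# Assumed report phrasing: the RV qualifier follows the feature name within
# the same sentence; the TR qualifier directly precedes the feature name.
RVWM_RE = re.compile(
    r"right ventric(?:le|ular)[^.]*?\b(" + _qualifier_pattern(RVWM_SCORES) + r")\b",
    re.IGNORECASE,
)
TR_RE = re.compile(
    r"\b(" + _qualifier_pattern(TR_SCORES) + r")\s+tricuspid (?:valve )?regurgitation",
    re.IGNORECASE,
)


def score_report(text):
    """Return (rvwm_score, tr_score); a score is None when nothing matched."""
    rv = RVWM_RE.search(text)
    tr = TR_RE.search(text)
    return (
        RVWM_SCORES[rv.group(1).lower()] if rv else None,
        TR_SCORES[tr.group(1).lower()] if tr else None,
    )


report = (
    "The right ventricular systolic function is mildly depressed. "
    "There is moderate tricuspid regurgitation."
)
print(score_report(report))
```

A report with no matching phrase yields `None` for that feature, which corresponds to the situation where no descriptive text is available for extraction.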

Results:
A total of 707 echocardiograms from 180 patients were compared. The NLP model correctly identified RVWM in 93.9% (664/707) and TR in 82.6% (584/707). There was a high degree of agreement for both features (RVWM: k=0.93 [0.88-0.98]; TR: k=0.84 [0.80-0.89]). The most common reason for a no-match scenario was an absence of descriptive text in the RD that was available in the electronic health record (EHR); this accounted for 19% (8/43) of RVWM and 66% (81/123) of TR no-matches. Misidentification by the NLP algorithm accounted for 37% (16/43) of RVWM and 15% (19/123) of TR no-matches. These represent the algorithm assigning an incorrect score due to misinterpretation of echocardiogram report language, most commonly in reports with complex, verbose descriptions of features.
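For readers unfamiliar with the statistic, the sketch below shows one way to compute an unweighted Cohen's kappa from paired MRA and NLP scores, and recomputes the raw agreement percentages from the counts reported above. The abstract does not say whether a weighted or unweighted kappa was used, so the unweighted form is an assumption, and the paired scores in the example are invented purely for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length sequences of labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical label; kappa is undefined,
        # so this sketch reports perfect agreement by convention.
        return 1.0
    return (observed - expected) / (1 - expected)


# Invented toy scores, not study data.
mra_scores = [0, 0, 2, 2, 4, 6]
nlp_scores = [0, 0, 2, 4, 4, 6]
print(round(cohens_kappa(mra_scores, nlp_scores), 3))

# Raw agreement recomputed from the counts reported in the Results.
rvwm_agreement = round(100 * 664 / 707, 1)
tr_agreement = round(100 * 584 / 707, 1)
print(rvwm_agreement, tr_agreement)
```

Kappa discounts the agreement expected by chance, which is why it is reported alongside the raw match percentages rather than in place of them.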

Conclusion:
The NLP algorithm demonstrates high proficiency in collecting echocardiogram data when text is available for extraction. Its use is limited in reports containing complex descriptive language. These data suggest its utility as an accompaniment for data abstraction to increase data collection efficiency and reduce MRA errors.

Girvin, Zachary (Vanderbilt, Nashville, Tennessee, United States)
Gangireddy, Srushti (Vanderbilt, Nashville, Tennessee, United States)
Coleman, Andersen (VANDERBILT CHILDRENS HOSPITAL, Nashville, Tennessee, United States)
Ong, Henry (Vanderbilt, Nashville, Tennessee, United States)
Wei, Wei-qi (Vanderbilt University Medical Ctr, Nashville, Tennessee, United States)
Kannankeril, Prince (VANDERBILT CHILDRENS HOSPITAL, Nashville, Tennessee, United States)
Sunthankar, Sudeep (Vanderbilt, Nashville, Tennessee, United States)

Author Disclosures:

Zachary Girvin:

DO NOT have relevant financial relationships

Srushti Gangireddy:

DO NOT have relevant financial relationships

Andersen Coleman:

DO NOT have relevant financial relationships

Henry Ong:

No Answer

Wei-Qi Wei:

No Answer

Prince Kannankeril:

No Answer

Sudeep Sunthankar:

DO NOT have relevant financial relationships

Meeting Info:

Scientific Sessions 2024

2024

Chicago, Illinois

Session Info:

Pediatric Cardiac Care

Saturday, 11/16/2024, 02:00PM - 03:00PM

Abstract Poster Session

