Machine learning to classify left ventricular hypertrophy using electrocardiographic feature extraction by variational autoencoder

Gupta, Amulya
Harvey, Christopher
DeBauge, Ashley
Shomaji, Sumaiya
Yao, Zijun
Lee, Yongkuk
Noheria, Amit
Issue Date
2026-03-24
Type
Article
Keywords
Artificial intelligence, Deep learning, ECG, Electrocardiogram, Left ventricular hypertrophy, LVH, Machine learning, Variational autoencoder
Citation
Gupta, Amulya & Harvey, Christopher & DeBauge, Ashley & Shomaji, Sumaiya & Yao, Zijun & Noheria, Amit. (2026). Machine learning to classify left ventricular hypertrophy using ECG feature extraction by variational autoencoder. Heart Rhythm O2. DOI: 10.1016/j.hroo.2026.03.019
Abstract
Background: Traditional electrocardiographic (ECG) criteria for left ventricular hypertrophy (LVH) have modest diagnostic yield.
Objective: This study aimed to develop and validate machine learning (ML) models for LVH diagnosis from ECG.
Methods: ECG calculations (rate, intervals, and axis); R-wave, S-wave, and overall-QRS amplitudes; and QRS voltage-time integrals were obtained from 12-lead, vectorcardiographic X-Y-Z–lead, and 3-dimensional (root-sum-square) ECGs. Deep learning–enabled latent embeddings (30 per ECG) were extracted from representative-beat signals using a variational autoencoder (pretrained on 1.18 million unselected ECGs). Logistic regression, random forest, light gradient boosted machine (LGBM), residual neural network, and multilayered perceptron network models using ECG features (calculations and embeddings) and sex, and a convolutional neural network (CNN) using ECG signals alone, were trained to predict LVH (left ventricular mass index, women >95 g/m²; men >115 g/m²) on 482,734 ECG-echocardiogram pairs (±45 days). Areas under the receiver operating characteristic curve were reported from a holdout testing set.
Results: In the testing set (n = 54,984), the area under the receiver operating characteristic curve for LVH classification was higher for ML models using ECG features (LGBM 0.794; multilayered perceptron 0.793; residual neural network 0.795) than for the best individual ECG variable (Z-axis QRS voltage-time integral 0.707), the best traditional criterion (Cornell voltage-duration product 0.716), and the CNN using ECG signals (0.788). Among patients without LVH who had a follow-up echocardiogram >1 year later, LGBM false positives, compared with true negatives, had 3.07-fold higher odds of developing future LVH (95% confidence interval 2.44–3.86; P < .0001).
Conclusion: ML models are superior to traditional ECG criteria for classifying LVH. Models trained on extracted ECG features, including deep-learning latent space representations, can outperform CNN models trained on ECG signals.
© 2026 Heart Rhythm Society.
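The label definition, the 3-dimensional (root-sum-square) signal, and the reported metric can be illustrated with a minimal Python sketch. This is not the authors' code. The function names, the "F"/"M" sex codes, and the pairwise AUC computation are assumptions made for illustration. Only the left ventricular mass index cutoffs (women >95 g/m², men >115 g/m²) and the root-sum-square combination of the X, Y, and Z leads come from the abstract.

```python
import math

# Sex-specific LVMI cutoffs in g/m^2, as quoted in the abstract.
# The "F"/"M" keys are an assumed encoding, not taken from the article.
LVMI_THRESHOLD = {"F": 95.0, "M": 115.0}


def has_lvh(lvmi, sex):
    """Return True when the LVMI strictly exceeds the sex-specific cutoff."""
    return lvmi > LVMI_THRESHOLD[sex]


def root_sum_square(x, y, z):
    """Combine X, Y, Z lead samples into one 3-dimensional magnitude signal."""
    return [math.sqrt(a * a + b * b + c * c) for a, b, c in zip(x, y, z)]


def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count as 0.5).

    Quadratic in the number of samples, so suitable only for small examples.
    """
    pos = [s for s, lab in zip(scores, labels) if lab]
    neg = [s for s, lab in zip(scores, labels) if not lab]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

For example, `has_lvh(96.0, "F")` is true while `has_lvh(115.0, "M")` is false, because the cutoffs are strict inequalities. A dataset the size of the study's testing set would call for a rank-based AUC implementation rather than this pairwise one.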
Description
This is an open access article under the CC BY license.
Publisher
Elsevier B.V.
Journal
Heart Rhythm O2
ISSN
2666-5018