A large-scale multi-label 12-lead electrocardiogram database with standardized diagnostic statements

Hui Liu; Dan Chen; Da Chen; Xiyu Zhang; Huijie Li; Lipan Bian; Minglei Shu; Yinglong Wang

doi:10.1038/s41597-022-01403-5

Sci Data. 2022 Jun 7;9(1):272. doi: 10.1038/s41597-022-01403-5.

Authors

Hui Liu¹, Dan Chen¹,², Da Chen¹, Xiyu Zhang², Huijie Li², Lipan Bian¹, Minglei Shu³, Yinglong Wang⁴

Affiliations

¹ Shandong Artificial Intelligence Institute, Qilu University of Technology (Shandong Academy of Sciences), Jinan, 250014, China.
² Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, 250021, China.
³ Shandong Artificial Intelligence Institute, Qilu University of Technology (Shandong Academy of Sciences), Jinan, 250014, China. shuml@sdas.org.
⁴ Shandong Artificial Intelligence Institute, Qilu University of Technology (Shandong Academy of Sciences), Jinan, 250014, China. wangylscsc@126.com.

Abstract

Deep learning approaches have shown strong performance in the automatic interpretation of the electrocardiogram (ECG). However, large-scale public 12-lead ECG data are still limited, and diagnostic labels are not uniform across datasets, which widens the semantic gap between such data and clinical practice. In this study, we present a large-scale multi-label 12-lead ECG database with standardized diagnostic statements. The dataset contains 25770 ECG records from 24666 patients, acquired at Shandong Provincial Hospital (SPH) between August 2019 and August 2020. Record lengths range from 10 to 60 seconds. The diagnostic statements of all ECG records fully comply with the AHA/ACC/HRS recommendations for the standardization and interpretation of the electrocardiogram, and consist of 44 primary statements and 15 modifiers as defined by that standard. 46.04% of the records in the dataset contain ECG abnormalities, and 14.45% of the records have multiple diagnostic statements. The dataset also contains additional patient demographics.
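
To give a sense of scale, the percentages in the abstract can be converted into approximate record counts. The short Python sketch below does only that arithmetic; the variable names are illustrative, and the resulting counts are rounded estimates derived from the published percentages, not figures reported by the authors.

```python
# Figures stated in the abstract.
total_records = 25770
total_patients = 24666
pct_abnormal = 46.04      # % of records containing ECG abnormalities
pct_multi_label = 14.45   # % of records with multiple diagnostic statements

# Approximate counts implied by the percentages (rounded estimates,
# not values reported in the paper).
abnormal_records = round(total_records * pct_abnormal / 100)
multi_label_records = round(total_records * pct_multi_label / 100)

# Average number of records per patient.
records_per_patient = total_records / total_patients

print(f"~{abnormal_records} records with abnormalities")
print(f"~{multi_label_records} records with multiple statements")
print(f"~{records_per_patient:.3f} records per patient")
```

On these numbers, most patients contribute a single record, so a patient-wise train/test split would differ only slightly from a record-wise one, though it remains the safer choice for avoiding leakage.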
