Deep learning approaches have exhibited a great ability on automatic interpretation of the electrocardiogram (ECG). However, large-scale public 12-lead ECG data are still limited, and the diagnostic labels are not uniform, which increases the semantic gap between clinical practice. In this study, we present a large-scale multi-label 12-lead ECG database with standardized diagnostic statements. The dataset contains 25770 ECG records from 24666 patients, which were acquired from Shandong Provincial Hospital (SPH) between 2019/08 and 2020/08. The record length is between 10 and 60 seconds. The diagnostic statements of all ECG records are in full compliance with the AHA/ACC/HRS recommendations, which aims for the standardization and interpretation of the electrocardiogram, and consist of 44 primary statements and 15 modifiers as per the standard. 46.04% records in the dataset contain ECG abnormalities, and 14.45% records have multiple diagnostic statements. The dataset also contains additional patient demographics.
© 2022. The Author(s).