Leveraging Deep Learning of Chest Radiograph Images to Identify Individuals at High Risk for Chronic Obstructive Pulmonary Disease

medRxiv [Preprint]. 2024 Nov 15:2024.11.14.24317055. doi: 10.1101/2024.11.14.24317055.

Abstract

Background: This study assessed whether deep learning applied to routine outpatient chest X-rays (CXRs) can identify individuals at high risk for incident chronic obstructive pulmonary disease (COPD).

Methods: Using cancer screening trial data, we previously developed a convolutional neural network (CXR-Lung-Risk) to predict lung-related mortality from a CXR image. In this study, we externally validated CXR-Lung-Risk for predicting incident COPD from routine CXRs. We identified outpatients without lung cancer, COPD, or emphysema who had a CXR taken in 2013-2014 at a Mass General Brigham site in Boston, Massachusetts. The primary outcome was 6-year incident COPD. Discrimination was assessed using the area under the receiver operating characteristic curve (AUC) and compared with that of the TargetCOPD clinical risk score. All analyses were stratified by smoking status. A secondary analysis was conducted in the Project Baseline Health Study (PBHS) to test associations of CXR-Lung-Risk with pulmonary function and protein abundance.
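The stratified discrimination comparison described in the Methods can be illustrated with a minimal sketch. It computes AUC as the probability that a randomly chosen case receives a higher score than a randomly chosen non-case, separately for each smoking stratum. Everything in it is an assumption made for illustration: the function names, the row layout, and the toy scores are invented and are not study data, and the significance test behind the reported p-values is not reproduced.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via pairwise comparison of cases and non-cases (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy rows, invented for illustration only (NOT from the study).
# Layout (hypothetical): (smoking stratum, incident COPD, clinical score, combined score)
rows = [
    ("ever", 1, 0.60, 0.80),
    ("ever", 0, 0.70, 0.40),
    ("ever", 1, 0.50, 0.70),
    ("ever", 0, 0.30, 0.20),
    ("never", 1, 0.40, 0.90),
    ("never", 0, 0.40, 0.30),
    ("never", 0, 0.20, 0.50),
    ("never", 1, 0.10, 0.60),
]


def stratified_auc(data, stratum: str, score_index: int) -> float:
    """AUC of one score column within a single smoking stratum."""
    subset = [r for r in data if r[0] == stratum]
    return auc([r[1] for r in subset], [r[score_index] for r in subset])


# For each stratum: (AUC of clinical score alone, AUC of combined score)
results = {
    s: (stratified_auc(rows, s, 2), stratified_auc(rows, s, 3))
    for s in ("ever", "never")
}

for stratum, (clinical, combined) in results.items():
    print(f"{stratum}-smokers: clinical AUC={clinical:.3f}, combined AUC={combined:.3f}")
```

The pairwise form is quadratic in sample size, which is fine for a toy example; for cohorts of the size reported here, a rank-based implementation from a statistics library would be the practical choice.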

Findings: The primary analysis consisted of 12,550 ever-smokers (mean age 62·4±6·8 years, 48·9% male, 12·4% rate of 6-year COPD) and 15,298 never-smokers (mean age 63·0±8·1 years, 42·8% male, 3·8% rate of 6-year COPD). CXR-Lung-Risk had additive predictive value beyond the TargetCOPD score for 6-year incident COPD in both ever-smokers (CXR-Lung-Risk + TargetCOPD AUC: 0·73 [95% CI: 0·72-0·74] vs. TargetCOPD alone AUC: 0·66 [0·65-0·68], p<0·01) and never-smokers (CXR-Lung-Risk + TargetCOPD AUC: 0·70 [0·67-0·72] vs. TargetCOPD alone AUC: 0·60 [0·57-0·62], p<0·01). In secondary analyses of 2,097 individuals in the PBHS, CXR-Lung-Risk was associated with worse pulmonary function and with abundance of SCGB3A2 (secretoglobin family 3A member 2) and LYZ (lysozyme), proteins involved in pulmonary physiology.

Interpretation: In external validation, a deep learning model applied to a routine CXR image identified individuals at high risk for incident COPD, beyond known risk factors.

Funding: The Project Baseline Health Study and this analysis were funded by Verily Life Sciences, San Francisco, California.

ClinicalTrials.gov identifier: NCT03154346.

Publication types

  • Preprint

Associated data

  • ClinicalTrials.gov/NCT03154346