Background: Accurate identification of hospitalizations for acute exacerbations of chronic obstructive pulmonary disease (AECOPD) within electronic health care records is important for research, for public health, and for informing health care utilization and service provision. We aimed to develop a strategy to identify hospitalizations for AECOPD in secondary care data and to investigate the validity of strategies to identify hospitalizations for AECOPD in primary care data.
Methods: We identified patients with chronic obstructive pulmonary disease (COPD) in the Clinical Practice Research Datalink (CPRD) with linked Hospital Episode Statistics (HES) data. We used discharge summaries for recent hospitalizations for AECOPD to develop a strategy to identify the recording of hospitalizations for AECOPD in HES. We then used the HES strategy as a reference standard to investigate the positive predictive value (PPV) and sensitivity of strategies for identifying hospitalizations for AECOPD using general practice CPRD data. We tested two strategies: 1) codes for hospitalization for AECOPD and 2) a code for AECOPD other than hospitalization recorded on the same day as a code for hospitalization for an unspecified reason.
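The comparison against a reference standard described above can be sketched as follows. This is a minimal illustration, not the study's actual code: the event representation (a patient identifier paired with a date), the exact same-day matching rule, the function names, and the use of a Wilson interval for the 95% CI are all assumptions, and the toy events are invented for the example.

```python
from math import sqrt


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed CI method)."""
    if total == 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def same_day_events(aecopd_events, admission_events):
    """Strategy 2 sketch: AECOPD code and unspecified-admission code
    recorded for the same patient on the same day."""
    return aecopd_events & admission_events


def ppv_and_sensitivity(candidate_events, reference_events):
    """Compare primary care events with the reference standard.

    PPV = true positives / all candidate events.
    Sensitivity = true positives / all reference events.
    Matching on exact (patient, date) is a simplifying assumption.
    """
    true_positives = len(candidate_events & reference_events)
    ppv = true_positives / len(candidate_events)
    sensitivity = true_positives / len(reference_events)
    return {
        "ppv": ppv,
        "ppv_ci": wilson_ci(true_positives, len(candidate_events)),
        "sensitivity": sensitivity,
        "sensitivity_ci": wilson_ci(true_positives, len(reference_events)),
    }


if __name__ == "__main__":
    # Invented toy events: (patient_id, ISO date).
    aecopd = {(1, "2010-01-05"), (2, "2010-02-01"), (6, "2010-04-09")}
    admission = {(1, "2010-01-05"), (2, "2010-02-01"), (7, "2010-05-11")}
    reference = {(1, "2010-01-05"), (3, "2010-03-01"),
                 (4, "2010-03-15"), (5, "2010-06-20")}

    candidates = same_day_events(aecopd, admission)
    print(ppv_and_sensitivity(candidates, reference))
```

In practice, a matching window of several days around the admission date may be more appropriate than exact-date matching, because primary care records are often updated after discharge.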
Results: In total, 27,182 patients with COPD were included. Our strategy to identify hospitalizations for AECOPD in HES had a sensitivity of 87.5%. When compared with HES, using a code suggesting hospitalization for AECOPD in CPRD resulted in a PPV of 50.2% (95% confidence interval [CI] 48.5%-51.8%) and a sensitivity of 4.1% (95% CI 3.9%-4.3%). Using a code for AECOPD recorded on the same day as a code for hospitalization for an unspecified reason resulted in a PPV of 43.3% (95% CI 42.3%-44.2%) and a sensitivity of 5.4% (95% CI 5.1%-5.7%).
Conclusion: Hospitalizations for AECOPD can be identified with high sensitivity in the HES database. In contrast, strategies to identify hospitalizations for AECOPD in primary care data alone perform poorly, with a PPV of roughly 50% at best and a sensitivity below 6%. Primary care data alone should therefore not be used to identify hospitalizations for AECOPD. Instead, researchers should use primary care data that are linked to secondary care data.
Keywords: COPD; cause-specific hospitalization; hospitalization; linked data; validation.