Small-area estimation for public health surveillance using electronic health record data: reducing the impact of underrepresentation

BMC Public Health. 2022 Aug 9;22(1):1515. doi: 10.1186/s12889-022-13809-2.

Abstract

Background: Electronic Health Record (EHR) data are increasingly used to monitor population health because of their timeliness, granularity, and large sample sizes. While EHR data are often sufficient to estimate disease prevalence and trends for large geographic areas, the same accuracy and precision may not carry over to smaller areas that are sparsely represented by non-random samples.

Methods: We developed small-area estimation models using a combination of EHR data drawn from MDPHnet, an EHR-based public health surveillance network in Massachusetts, the American Community Survey, and state hospitalization data. We estimated municipality-specific prevalence rates of asthma, diabetes, hypertension, obesity, and smoking in each of the 351 municipalities in Massachusetts in 2016. Models were compared against Behavioral Risk Factor Surveillance System (BRFSS) state and small area estimates for 2016.

Results: Integrating progressively more variables into prediction models generally reduced mean absolute error (MAE) relative to municipality-level BRFSS small area estimates, with diabetes the exception: asthma (2.24% MAE crude, 1.02% MAE modeled), diabetes (3.13% MAE crude, 3.48% MAE modeled), hypertension (2.60% MAE crude, 1.48% MAE modeled), obesity (4.92% MAE crude, 4.07% MAE modeled), and smoking (5.33% MAE crude, 2.99% MAE modeled). Correlation between modeled estimates and BRFSS estimates for the 13 municipalities in Massachusetts covered by BRFSS's 500 Cities ranged from 81.9% (obesity) to 96.7% (diabetes).
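
The two comparison metrics reported above can be illustrated with a minimal sketch. It assumes MAE is the unweighted mean of absolute differences between municipality-level estimates and the reference estimates, and that "correlation" means the Pearson coefficient; the abstract states neither explicitly. The function names and the prevalence values are hypothetical and are not taken from the study.

```python
from math import sqrt


def mean_absolute_error(estimates, reference):
    """Unweighted mean of |estimate - reference| across areas."""
    if not estimates or len(estimates) != len(reference):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - r) for e, r in zip(estimates, reference)) / len(estimates)


def pearson_correlation(x, y):
    """Pearson correlation coefficient (an assumed choice of correlation)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical prevalence estimates (percent) for four municipalities.
reference = [10.0, 12.0, 9.0, 11.0]   # stand-in for survey-based small area estimates
crude = [13.0, 9.5, 11.0, 13.5]       # stand-in for unadjusted EHR prevalence
modeled = [10.5, 11.0, 9.5, 11.5]     # stand-in for model-based estimates

mae_crude = mean_absolute_error(crude, reference)
mae_modeled = mean_absolute_error(modeled, reference)
r_modeled = pearson_correlation(modeled, reference)

print(f"MAE crude:   {mae_crude:.3f} percentage points")
print(f"MAE modeled: {mae_modeled:.3f} percentage points")
print(f"Correlation (modeled vs reference): {r_modeled:.3f}")
```

In this toy data the modeled estimates have the lower MAE, mirroring the direction reported for most conditions. A lower MAE and a high correlation measure different things: MAE captures absolute agreement, while correlation captures only whether municipalities are ranked and spaced similarly.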

Conclusions: Small-area estimation using EHR data is feasible and generates estimates comparable to BRFSS state and small-area estimates. Integrating EHR data with survey data can provide timely and accurate disease monitoring tools for areas with sparse data coverage.

Keywords: Asthma; Behavioral risk factor surveillance system; Diabetes mellitus; Hypertension; Obesity; Population surveillance; Smoking.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Asthma* / epidemiology
  • Behavioral Risk Factor Surveillance System
  • Diabetes Mellitus* / epidemiology
  • Electronic Health Records
  • Humans
  • Hypertension* / epidemiology
  • Obesity
  • Population Surveillance
  • Prevalence
  • Public Health Surveillance
  • United States