Background: Electronic health records (EHRs) provide researchers with large samples, detailed clinical data, and other advantages for performing high-quality observational health research on diverse populations. We review and demonstrate strategies for the design and analysis of cohort studies on neighborhood diversity and health, including evaluation of the effects of race, ethnicity, and neighborhood socioeconomic position on disease prevalence and health outcomes, using localized EHR data.
Methods: Design strategies include integrating and harmonizing EHR data across multiple local health systems and defining the population(s) of interest and cohort extraction procedures for a given analysis based on the goal(s) of the study. Analysis strategies address inferential goals, including the mechanistic study of social risks, statistical adjustment for differences in distributions of social and neighborhood-level characteristics between available EHR data and the underlying local population, and inference on individual neighborhoods. We provide analyses of local variation in mortality rates within Cuyahoga County, Ohio.
Results: When the goal of the analysis is to adjust EHR samples to be more representative of local populations, sampling and weighting are effective. Causal mediation analysis can inform estimates of the effects of racism (through racial residential segregation) on health outcomes. Spatial analysis is appealing for large-scale EHR data as a means of studying heterogeneity among neighborhoods, even at a given level of overall neighborhood disadvantage.
Conclusions: The methods described are a starting point for robust EHR-derived cohort analysis of diverse populations. The methods offer opportunities for researchers to pursue detailed analyses of current and historical underlying circumstances of social policy and inequality. Investigators can employ combinations of these methods to achieve greater robustness of results.
Highlights: EHR data are an abundant resource for studying neighborhood diversity and health. When using EHR data for these studies, the goals of the study should be carefully considered in determining cohort specifications and analytic approaches. Causal mediation analysis, stratification, and spatial analysis are effective methods for characterizing social mechanisms and heterogeneity across localized populations.
Keywords: electronic health records; health disparities; neighborhood disadvantage; race and ethnicity; spatial analysis.
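Illustration: The weighting adjustment mentioned in the Results can be sketched with simple post-stratification, in which each EHR record is weighted by the ratio of its stratum's share in the local population to its share in the sample. This is a minimal sketch under stated assumptions, not the procedure used in the study: the stratum labels, the toy outcome data, the population shares, and the function names `poststratification_weights` and `weighted_mean` are all hypothetical.

```python
from collections import Counter


def poststratification_weights(sample_strata, population_shares):
    """Return one weight per record so that weighted stratum shares
    in the sample match the given population shares.

    sample_strata: list of stratum labels, one per sampled record.
    population_shares: dict mapping stratum label -> population proportion.
    A stratum present in the sample but absent from population_shares
    raises KeyError.
    """
    n = len(sample_strata)
    counts = Counter(sample_strata)
    weights = []
    for stratum in sample_strata:
        sample_share = counts[stratum] / n
        weights.append(population_shares[stratum] / sample_share)
    return weights


def weighted_mean(values, weights):
    """Weighted average of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical toy data: neighborhood disadvantage stratum and a binary
# mortality indicator for four EHR patients. Not real study data.
sample_strata = ["low", "low", "low", "high"]
died = [0, 0, 1, 1]

# Assumed population shares (for example, taken from census tables).
population_shares = {"low": 0.5, "high": 0.5}

weights = poststratification_weights(sample_strata, population_shares)
crude = sum(died) / len(died)
adjusted = weighted_mean(died, weights)

print(f"crude mortality:    {crude:.3f}")
print(f"adjusted mortality: {adjusted:.3f}")
```

In this toy sample the high-disadvantage stratum is under-represented relative to the assumed population, so its single record receives a larger weight and the adjusted rate moves toward that stratum's rate. Real applications would typically stratify on several characteristics at once and may need raking or model-based weights when cells are sparse.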