Evaluating Natural Course Performance in Parametric G-formula: Review of Current Practice and Illustration Based on the United Autoworkers-General Motors Cohort

Am J Epidemiol. 2024 Oct 23:kwae410. doi: 10.1093/aje/kwae410. Online ahead of print.

Abstract

The parametric g-formula is a causal inference method that appropriately adjusts for time-varying confounding affected by prior exposure. Like all parametric methods, it assumes correct model specification, which is usually assessed by comparing the observed outcome with the outcome simulated under no intervention (the natural course). However, it is unclear how natural course performance should be evaluated and whether variables other than the outcome should also be considered. We reviewed current practices for evaluating model misspecification in applications of the parametric g-formula. To illustrate the pitfalls of current practices, we then applied the parametric g-formula to examine cardiovascular disease mortality in relation to occupational exposure in the United Autoworkers-General Motors (UAW-GM) cohort, comparing 20 parametric model sets and qualitatively assessing natural course performance for all time-varying variables over follow-up. We found that current practices for evaluating model misspecification are often insufficient, increasing the risk of bias and statistical cherry-picking. In our motivating analyses of the UAW-GM cohort, good natural course performance for the outcome did not guarantee good simulation of other covariates; poor predictions of exposures and covariates may still exist. We recommend reporting natural course performance for all time-varying variables at all time points. Objective criteria for evaluating model misspecification in the parametric g-formula need to be developed.
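
The recommended check can be illustrated with a minimal sketch. It assumes observed and natural-course simulated data are available as hypothetical (variable, time, value) records; the function name, record layout, and toy numbers are illustrative assumptions and are not taken from the UAW-GM analysis. The sketch reports the gap between simulated and observed means for every time-varying variable at every time point, rather than for the outcome alone.

```python
from collections import defaultdict
from statistics import mean


def natural_course_gaps(observed, simulated):
    """Return {(variable, time): simulated mean - observed mean}.

    Both inputs are iterables of (variable, time, value) records. This
    layout is an assumption made for illustration only.
    """
    def cell_means(records):
        cells = defaultdict(list)
        for variable, time, value in records:
            cells[(variable, time)].append(value)
        return {key: mean(values) for key, values in cells.items()}

    obs = cell_means(observed)
    sim = cell_means(simulated)
    # Compare only the (variable, time) cells present in both data sets.
    return {key: sim[key] - obs[key] for key in sorted(obs.keys() & sim.keys())}


# Toy, made-up records: the outcome is reproduced but the exposure is not.
observed = [
    ("cvd_death", 1, 0), ("cvd_death", 1, 1),
    ("exposure", 1, 2.0), ("exposure", 1, 4.0),
]
simulated = [
    ("cvd_death", 1, 0), ("cvd_death", 1, 1),
    ("exposure", 1, 5.0), ("exposure", 1, 5.0),
]

gaps = natural_course_gaps(observed, simulated)
for (variable, time), gap in gaps.items():
    print(f"{variable} at t={time}: simulated - observed = {gap}")
```

In this toy input the outcome gap is zero while the exposure gap is not, which is the pattern that an outcome-only comparison would miss.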

Keywords: causal inference; g-computation; g-formula; healthy worker effect; natural course; parametric g-formula.