Impact of missing data due to drop-outs on estimators for rates of change in longitudinal studies: a simulation study

Stat Med. 2001 Dec 30;20(24):3715-28. doi: 10.1002/sim.1114.

Abstract

Many cohort studies and clinical trials are designed to compare rates of change over time in one or more disease markers in several groups. One major problem in such longitudinal studies is missing data due to patient drop-out. The bias and efficiency of six different methods to estimate rates of change in longitudinal studies with incomplete observations were compared: generalized estimating equation estimates (GEE) proposed by Liang and Zeger (1986); unweighted average of ordinary least squares estimates (OLSE) of individual rates of change (UWLS); weighted average of OLSE (WLS); conditional linear model estimates (CLE), a covariate-type estimate proposed by Wu and Bailey (1989); random effect (RE) estimates; and joint multivariate RE (JMRE) estimates. The latter method combines a linear RE model for the underlying pattern of the marker with a log-normal survival model for the informative drop-out process. The performance of these methods in the presence of data missing completely at random (MCAR), missing at random (MAR) and non-ignorably missing (NIM) was compared in simulation studies. Data for the disease marker were generated under the linear random effects model with parameter values derived from realistic examples in HIV infection. Rates of drop-out, assumed to increase over time, were allowed to be independent of marker values or to depend either only on previous marker values or on both previous and current marker values. Under MCAR all six methods yielded unbiased estimates of both group mean rates and between-group differences. However, the cross-sectional view of the data in the GEE method resulted in seriously biased estimates under MAR and NIM drop-out processes. The bias in the estimates ranged from 30 per cent to 50 per cent. The degree of bias in the GEE estimates increases with the severity of non-randomness and with the proportion of MAR data. Under MCAR and MAR all the other five methods performed relatively well. RE and JMRE estimates were more efficient (that is, had smaller variance) than UWLS, WLS and CLE estimates. Under NIM, WLS and particularly RE estimates tended to underestimate the average rate of marker change (bias approximately 10 per cent). Under NIM, UWLS, CLE and JMRE performed better in terms of bias (3-5 per cent), with JMRE giving the most efficient estimates. Given that markers are key variables related to disease progression, missing marker data are likely to be at least MAR. Thus, the GEE method may not be appropriate for analysing such longitudinal marker data. The potential biases due to incomplete data require greater recognition in reports of longitudinal studies. Sensitivity analyses to assess the effect of drop-outs on inferences about the target parameters are important.
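
The simulation design described in the abstract can be illustrated with a minimal sketch. It generates marker data from a linear random effects model, applies monotone drop-out under the three mechanisms, and compares two of the six estimators: UWLS and a pooled ordinary least squares fit. The pooled fit is used here as a stand-in for GEE, since it coincides with the GEE point estimate for a linear model under an independence working correlation; whether the study used that working correlation is an assumption. Every numeric value (means, variances, visit times, drop-out coefficients) and every function name is hypothetical and chosen only for illustration; none is taken from the paper.

```python
import math
import random

# All numeric values are illustrative assumptions, not the paper's settings.
TRUE_SLOPE = -1.5            # population mean rate of change per visit
TIMES = [0, 1, 2, 3, 4, 5]   # scheduled visit times


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def dropout_prob(mechanism, y_prev, y_curr):
    """Probability of dropping out just before the current visit."""
    if mechanism == "MCAR":
        return 0.12                                     # ignores the marker
    if mechanism == "MAR":
        return logistic(-2.0 - 0.3 * (y_prev - 20.0))   # last observed value
    if mechanism == "NIM":
        return logistic(-2.0 - 0.3 * (y_curr - 20.0))   # current, unobserved value
    raise ValueError(f"unknown mechanism: {mechanism}")


def simulate(n, mechanism, rng):
    """Return one (times, values) pair per subject, truncated at drop-out."""
    subjects = []
    for _ in range(n):
        b0 = rng.gauss(25.0, 5.0)         # subject-specific intercept
        b1 = rng.gauss(TRUE_SLOPE, 0.8)   # subject-specific slope
        y = [b0 + b1 * t + rng.gauss(0.0, 1.0) for t in TIMES]
        n_obs = len(TIMES)
        for j in range(2, len(TIMES)):    # first two visits are always observed
            if rng.random() < dropout_prob(mechanism, y[j - 1], y[j]):
                n_obs = j                 # monotone drop-out: visit j onward is missing
                break
        subjects.append((TIMES[:n_obs], y[:n_obs]))
    return subjects


def ols_slope(t, y):
    """Slope of the least squares line through the points (t, y)."""
    t_bar = sum(t) / len(t)
    y_bar = sum(y) / len(y)
    sxy = sum((a - t_bar) * (b - y_bar) for a, b in zip(t, y))
    sxx = sum((a - t_bar) ** 2 for a in t)
    return sxy / sxx


def uwls(subjects):
    """Unweighted average of the individual OLS slopes."""
    slopes = [ols_slope(t, y) for t, y in subjects]
    return sum(slopes) / len(slopes)


def pooled_ols(subjects):
    """One OLS fit to all observations, ignoring which subject they come from."""
    t_all = [a for t, _ in subjects for a in t]
    y_all = [b for _, y in subjects for b in y]
    return ols_slope(t_all, y_all)


def run(mechanism, n=2000, seed=1):
    rng = random.Random(seed)
    subjects = simulate(n, mechanism, rng)
    return uwls(subjects), pooled_ols(subjects)


if __name__ == "__main__":
    for mech in ("MCAR", "MAR", "NIM"):
        u, p = run(mech)
        print(f"{mech}: UWLS={u:.3f}  pooled OLS={p:.3f}  true={TRUE_SLOPE}")
```

Running the script prints both estimates for each mechanism. Under these assumed settings, subjects with low marker values leave early, so the pooled fit should understate the rate of decline once drop-out depends on the marker, which matches the direction of the GEE bias reported above. The magnitudes are not comparable with the published results, and a single replicate says nothing about efficiency; a fuller study would repeat the run over many seeds.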

Publication types

  • Comparative Study

MeSH terms

  • Adolescent
  • Adult
  • Bias
  • CD4 Lymphocyte Count
  • Child
  • Child, Preschool
  • Cohort Studies
  • Computer Simulation
  • Data Interpretation, Statistical
  • Disease Progression
  • Greece
  • HIV Infections / pathology
  • HIV-1 / growth & development
  • Humans
  • Longitudinal Studies*
  • Male
  • Middle Aged
  • Patient Dropouts*
  • Statistics as Topic / methods*