It is frequently of interest to estimate the time that individuals survive with a disease, that is, the time between disease onset and the occurrence of a clinical endpoint such as death. Epidemiologic survival data are commonly collected either from an incident cohort, whose members' disease onset occurs after the study baseline date, or from a prevalent cohort, whose members already have the disease at baseline and are followed forward in time. Incident cohort survival data are limited by study termination, while prevalent cohort survival data are biased by left truncation. In this article, we investigate the advantages, for estimating the survivor function, of a study design featuring simultaneous follow-up of prevalent and incident cohorts. Our analyses are supported by simulations and illustrated using data on survival after myotonic dystrophy diagnosis from the United Kingdom Clinical Practice Research Datalink (CPRD). We demonstrate that the nonparametric maximum likelihood estimator (NPMLE) based on combined incident and prevalent cohort data estimates the true survivor function very well, even for moderate sample sizes, and ameliorates the disadvantages of using a purely incident or purely prevalent cohort.
Keywords: Canadian Longitudinal Study on Aging; UK Clinical Practice Research Datalink; delayed entry; incident cohort; left truncation; myotonic dystrophy; prevalent cohort; survival analysis.