Objectives: To evaluate how well goodness-of-fit testing detects relevant violations of the assumptions underlying the much-criticized "standard" two-class latent class model. This model is often used to estimate the sensitivity and specificity of diagnostic tests in the absence of a gold reference standard, and it relies on the assumption that diagnostic test errors are independent. When this assumption is violated, accuracy estimates may be biased; goodness-of-fit testing is often used to check the assumption and thereby guard against bias.
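For orientation, the standard two-class model can be written out as follows. The notation is an assumption introduced here, not taken from the study: J binary tests with results y_j in {0, 1}, prevalence \pi, and test-specific sensitivity Se_j and specificity Sp_j. Under independence of test errors within each class, the probability of a response pattern is a two-component mixture of products:

\[
P(Y_1 = y_1, \dots, Y_J = y_J)
  = \pi \prod_{j=1}^{J} Se_j^{\,y_j} (1 - Se_j)^{1 - y_j}
  + (1 - \pi) \prod_{j=1}^{J} (1 - Sp_j)^{\,y_j} Sp_j^{\,1 - y_j}
\]

Goodness-of-fit tests compare the pattern frequencies implied by this expression with the observed ones; dependence between test errors within a class makes the product form incorrect.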
Study design and setting: We investigated the performance of goodness-of-fit testing by Monte Carlo simulation, with simulation scenarios based on three empirical examples.
Results: Goodness-of-fit tests lack power to detect relevant misfit of the standard two-class latent class model at sample sizes that are typically found in empirical diagnostic studies. The goodness-of-fit tests that are based on asymptotic theory are not robust to the sparseness of data. A parametric bootstrap procedure improves the evaluation of goodness of fit in the case of sparse data.
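A parametric bootstrap of this kind might be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: the likelihood-ratio statistic G², the EM starting values, the number of replicates, and all function names (`pattern_probs`, `fit_lcm`, `g_squared`, `bootstrap_gof`) are choices made here for illustration. The idea is to fit the model, simulate data sets from the fitted model, refit each one, and use the resulting statistics as the reference distribution in place of the asymptotic chi-square.

```python
import itertools
import numpy as np

def pattern_probs(prev, se, sp, patterns):
    """Joint probabilities (diseased, non-diseased) of each response pattern."""
    p_dis = np.prod(np.where(patterns == 1, se, 1 - se), axis=1)
    p_non = np.prod(np.where(patterns == 1, 1 - sp, sp), axis=1)
    return prev * p_dis, (1 - prev) * p_non

def fit_lcm(counts, patterns, n_iter=300, eps=1e-6):
    """EM fit of the two-class model to pattern counts (illustrative start values)."""
    n_tests = patterns.shape[1]
    prev, se, sp = 0.5, np.full(n_tests, 0.8), np.full(n_tests, 0.8)
    n = counts.sum()
    for _ in range(n_iter):
        a, b = pattern_probs(prev, se, sp, patterns)
        post = a / (a + b)                  # E-step: P(diseased | pattern)
        w_dis, w_non = counts * post, counts * (1 - post)
        # M-step: weighted proportions, clipped away from 0 and 1
        prev = np.clip(w_dis.sum() / n, eps, 1 - eps)
        se = np.clip((w_dis @ patterns) / w_dis.sum(), eps, 1 - eps)
        sp = np.clip((w_non @ (1 - patterns)) / w_non.sum(), eps, 1 - eps)
    return prev, se, sp

def fitted_probs(counts, patterns):
    """Model-implied pattern probabilities at the EM estimates."""
    prev, se, sp = fit_lcm(counts, patterns)
    a, b = pattern_probs(prev, se, sp, patterns)
    probs = a + b
    return probs / probs.sum()              # guard against rounding error

def g_squared(counts, expected):
    """Likelihood-ratio statistic; empty cells contribute zero."""
    mask = counts > 0
    return 2.0 * np.sum(counts[mask] * np.log(counts[mask] / expected[mask]))

def bootstrap_gof(counts, patterns, n_boot=200, seed=0):
    """Return the observed G^2 and its parametric bootstrap p-value."""
    rng = np.random.default_rng(seed)
    n = int(counts.sum())
    probs = fitted_probs(counts, patterns)
    observed = g_squared(counts, n * probs)
    exceed = 0
    for _ in range(n_boot):
        sim = rng.multinomial(n, probs)     # data generated under the fitted model
        sim_stat = g_squared(sim, n * fitted_probs(sim, patterns))
        exceed += sim_stat >= observed
    return observed, (exceed + 1) / (n_boot + 1)

# Hypothetical example: four tests, data generated under local independence.
patterns = np.array(list(itertools.product([0, 1], repeat=4)))
a, b = pattern_probs(0.3, np.full(4, 0.9), np.full(4, 0.95), patterns)
true_probs = (a + b) / (a + b).sum()
counts = np.random.default_rng(1).multinomial(300, true_probs)
g2, p_value = bootstrap_gof(counts, patterns, n_boot=99)
print(f"G^2 = {g2:.2f}, bootstrap p = {p_value:.3f}")
```

Four tests are used in the example because with three binary tests the model has as many parameters as free cell probabilities, leaving no degrees of freedom for a fit test. The parameter values and sample size in the example are arbitrary and do not correspond to the scenarios in the study.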
Conclusion: Our simulation study suggests that relevant violation of the local independence assumption underlying the standard two-class latent class model may remain undetected in empirical diagnostic studies, potentially leading to biased estimates of sensitivity and specificity.
Keywords: Goodness of fit; Latent class analysis; Local independence assumption; No gold standard; Sensitivity and specificity; Simulation.
Copyright © 2015 Elsevier Inc. All rights reserved.