Background: The aim of this study was to evaluate the calibration and discriminatory power of three predictive models of breast cancer risk.
Methods: We included 13,760 women who were first-time participants in the Sabadell-Cerdanyola Breast Cancer Screening Program in Catalonia, Spain. Three- and five-year risk projections for invasive cancer were obtained using the Gail, Chen and Barlow models. Incidence and mortality data were obtained from the Catalan registries. Calibration was assessed using the Hosmer-Lemeshow C statistic, and discrimination using the area under the receiver operating characteristic curve (AUC) and Harrell's C statistic.
Results: The Gail and Chen models showed good calibration, while the Barlow model overestimated the number of cases: the ratio between estimated and observed values at 5 years ranged from 0.86 to 1.55 for the first two models and from 1.82 to 3.44 for the Barlow model. The 5-year projections of the Chen and Barlow models had the highest discrimination, with an AUC of approximately 0.58. Harrell's C statistic showed very similar values in the 5-year projection for each of the models. Although they passed the calibration test, the Gail and Chen models overestimated the number of cases in some breast density categories.
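As an illustration of the two kinds of measure reported above, the following minimal Python sketch computes an estimated-to-observed ratio (calibration) and an AUC (discrimination) from predicted risks and binary outcomes. The function names and the toy data are hypothetical and are not taken from the study; the sketch ignores censoring and follow-up time, which Harrell's C statistic is designed to handle.

```python
from typing import Sequence


def expected_observed_ratio(predicted_risk: Sequence[float],
                            outcome: Sequence[int]) -> float:
    """Ratio of expected cases (sum of predicted risks) to observed cases.

    Values above 1 indicate that the model overestimates the number of cases.
    """
    observed = sum(outcome)
    if observed == 0:
        raise ValueError("no observed cases; the ratio is undefined")
    return sum(predicted_risk) / observed


def auc(predicted_risk: Sequence[float], outcome: Sequence[int]) -> float:
    """AUC as the proportion of case/non-case pairs in which the case
    has the higher predicted risk; tied pairs count as one half."""
    cases = [p for p, y in zip(predicted_risk, outcome) if y == 1]
    non_cases = [p for p, y in zip(predicted_risk, outcome) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                concordant += 1.0
            elif c == n:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Hypothetical toy data: predicted 5-year risks and observed outcomes
# (1 = invasive cancer diagnosed, 0 = not diagnosed).
toy_risks = [0.01, 0.02, 0.03, 0.04, 0.05, 0.10]
toy_outcomes = [0, 0, 1, 0, 0, 1]

if __name__ == "__main__":
    print("E/O ratio:", expected_observed_ratio(toy_risks, toy_outcomes))
    print("AUC:", auc(toy_risks, toy_outcomes))
```

An AUC of 0.5 corresponds to no discrimination, so a value near 0.58 means the models rank a woman who develops cancer above one who does not only slightly more often than chance.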
Conclusions: These models cannot be used as a measure of individual risk in early detection programs to customize screening strategies. The inclusion of longitudinal measures of breast density or other risk factors in joint models of survival and longitudinal data may be a step towards personalized early detection of breast cancer.