Purpose: Three generic claims-based algorithms, built on International Classification of Diseases, 10th revision (ICD-10) codes, French Long-Term Illness (LTI) data, and the Diagnosis Related Group (DRG) program, were developed to identify retirees with cancer using data from the French national health insurance information system (Système national des données de santé, or SNDS), which covers the entire French population. The present study aimed to estimate the algorithms' performance and to describe false positives and false negatives in detail.
Methods: Between 2011 and 2016, data from 7544 participants of the French retired self-employed craftsperson cohort (ESPrI) were first matched to SNDS data and then to data from French population-based cancer registries, which served as the gold standard. Performance indicators, such as sensitivity and positive predictive value, were estimated for the three algorithms in a subcohort of ESPrI.
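For readers less familiar with these indicators, the following is a minimal sketch of how sensitivity and positive predictive value are conventionally computed when registry data serve as the gold standard. The function names and the counts in the usage lines are hypothetical illustrations and are not taken from the study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of registry-confirmed cases that the algorithm also flags."""
    confirmed = true_pos + false_neg
    if confirmed == 0:
        raise ValueError("no registry-confirmed cases")
    return true_pos / confirmed


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Share of algorithm-flagged cases that the registry confirms."""
    flagged = true_pos + false_pos
    if flagged == 0:
        raise ValueError("no algorithm-flagged cases")
    return true_pos / flagged


# Hypothetical counts for one cancer site (illustrative only, not study data):
# 9 cases found by both sources, 1 missed by the algorithm, 3 flagged in error.
example_sensitivity = sensitivity(true_pos=9, false_neg=1)
example_ppv = positive_predictive_value(true_pos=9, false_pos=3)
```

Under this convention, a false negative is a registry case the algorithm missed, and a false positive is an algorithm-flagged case absent from the registry.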
Results: The third algorithm, which combined the LTI and DRG program data, achieved the best sensitivities (90.9%-100%) and positive predictive values (58.1%-95.2%), depending on the cancer site. The majority of false positives were in fact cancers of nearby organ sites (e.g., stomach for esophagus) and carcinomas in situ. Most false negatives were probably due to under-declaration of LTI.
Conclusion: Validated algorithms using data from the SNDS can be used for passive epidemiological follow-up for some cancer sites in the ESPrI cohort.
Keywords: administrative health data; cancer; cancer registry; claims-based algorithm; incident case.
© 2023 John Wiley & Sons Ltd.