Developing an algorithm across integrated healthcare systems to identify a history of cancer using electronic medical records

J Am Med Inform Assoc. 2022 Jun 14;29(7):1217-1224. doi: 10.1093/jamia/ocac044.

Abstract

Objective: Tumor registries in integrated healthcare systems (IHCS) have high precision for identifying incident cancer but often miss recently diagnosed cancers or those diagnosed outside of the IHCS. We developed an algorithm that uses the electronic medical record (EMR) to identify people with a history of cancer not captured in the tumor registry, so that adults aged 40-65 years with no history of cancer could be identified.

Materials and methods: The algorithm was developed at Kaiser Permanente Colorado and then applied to 7 other IHCS. We used tumor registry data, diagnosis and procedure codes, chemotherapy files, oncology encounters, and revenue data to develop the algorithm. Each IHCS adapted the algorithm to its EMR data and, after iterative chart review, calculated sensitivity and specificity to evaluate the algorithm's performance.
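
To make the evaluation step concrete, the sketch below shows one way an algorithm flag could be compared against chart review to obtain sensitivity and specificity. It is an illustration only, not the authors' implementation: the `PersonRecord` fields, the "any source counts" rule in `flag_cancer_history`, and the six toy records are all assumptions introduced here, and the abstract does not specify how the data sources were actually combined.

```python
from dataclasses import dataclass


@dataclass
class PersonRecord:
    # Hypothetical evidence flags; the real algorithm's inputs and rules
    # are not described in the abstract.
    in_tumor_registry: bool
    has_cancer_dx_code: bool
    has_cancer_procedure_code: bool


def flag_cancer_history(person: PersonRecord) -> bool:
    """Assumed rule: any single source of evidence flags the person."""
    return (
        person.in_tumor_registry
        or person.has_cancer_dx_code
        or person.has_cancer_procedure_code
    )


def sensitivity_specificity(flags, chart_truth):
    """Compare algorithm flags with chart-review findings (the reference)."""
    pairs = list(zip(flags, chart_truth))
    tp = sum(1 for f, t in pairs if f and t)          # flagged, cancer confirmed
    fn = sum(1 for f, t in pairs if not f and t)      # missed cancer
    tn = sum(1 for f, t in pairs if not f and not t)  # correctly unflagged
    fp = sum(1 for f, t in pairs if f and not t)      # flagged, no cancer
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one true case and one true non-case")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data, invented for illustration only.
records = [
    PersonRecord(True, False, False),
    PersonRecord(False, True, False),
    PersonRecord(False, False, False),
    PersonRecord(False, False, True),
    PersonRecord(False, False, False),
    PersonRecord(False, False, False),
]
chart_truth = [True, True, True, False, False, False]

flags = [flag_cancer_history(r) for r in records]
sens, spec = sensitivity_specificity(flags, chart_truth)
```

In an iterative workflow like the one described, disagreements between `flags` and `chart_truth` would presumably guide refinements to the rule before the two metrics are recalculated.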

Results: We included data from over 1.26 million eligible people across 8 IHCS; 55 601 (4.4%) were in a tumor registry, and 44 848 (3.5%) had a reported cancer not captured in a registry. The common attributes of the final algorithm at each site were diagnosis and procedure codes. Across the IHCS, the sensitivity of the algorithm ranged from 90.65% to 100%, and the specificity ranged from 87.91% to 100%.

Discussion: Relying only on tumor registry data would miss nearly half of the identified cancers. Our algorithm was robust and required only minor modifications to adapt to other EMR systems.
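
The "nearly half" figure can be checked from the counts reported in the Results. The short calculation below is a sketch that assumes the registry and non-registry groups are disjoint, as the phrase "not captured in a registry" suggests; the variable names are introduced here for illustration.

```python
# Counts taken from the Results section of the abstract.
registry_cases = 55_601      # people found in a tumor registry
non_registry_cases = 44_848  # reported cancer not captured in a registry

# Assumption: the two groups do not overlap.
total_identified = registry_cases + non_registry_cases

# Share of all identified cancers that a registry-only approach would miss.
share_missed = non_registry_cases / total_identified

print(f"{total_identified} identified; {share_missed:.1%} outside the registry")
```

Under that assumption, about 44.6% of the identified cancers would be missed by relying on registry data alone.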

Conclusion: This algorithm can identify cancer cases regardless of when the diagnosis occurred and may be useful for a variety of research applications or quality improvement projects around cancer care.

Keywords: algorithm; cancer; electronic health records.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Adult
  • Algorithms
  • Data Collection
  • Delivery of Health Care, Integrated*
  • Electronic Health Records
  • Humans
  • Neoplasms* / diagnosis