Objective: The objective of this study is to create a repository of computable, technology-agnostic phenotype definitions for the purposes of analysis and automatic cohort identification.
Materials and methods: We selected phenotype definitions from PheKB and excluded definitions that did not use structured data or were not used in published research. We translated these definitions into the Clinical Quality Language (CQL) and Fast Healthcare Interoperability Resources (FHIR) and validated them using code review and automated tests.
Results: A total of 33 phenotype definitions met our inclusion criteria. We developed 40 CQL libraries, 231 value sets, and 347 test cases. To support these test cases, a total of 1624 FHIR resources were created as test data.
Discussion and conclusion: Although a number of challenges were encountered while translating the phenotypes into structured form, such as requiring specialized knowledge, or imprecise, ambiguous, and conflicting language, we have created a repository and a development environment that can be used for future research on computable phenotypes.
Keywords: CQL; EHR-driven phenotyping; FHIR; cohort identification.
© The Author(s) 2021. Published by Oxford University Press on behalf of the American Medical Informatics Association.