Integrating a new knowledge organisation system for monoclonal antibodies for therapeutic use authorised in Europe into HeTOP terminology-ontology server

J Biomed Inform. 2023 Apr:140:104325. doi: 10.1016/j.jbi.2023.104325. Epub 2023 Mar 2.

Abstract

Monoclonal antibodies (MAs) are increasingly used in the therapeutic arsenal. Clinical Data Warehouses (CDWs) offer unprecedented opportunities for research on real-word data. The objective of this work is to develop a knowledge organization system on MAs for therapeutic use (MATUs) applicable in Europe to query CDWs from a multi-terminology server (HeTOP). After expert consensus, three main health thesauri were selected: the MeSH thesaurus, the National Cancer Institute thesaurus (NCIt) and the SNOMED CT. These thesauri contain 1,723 MAs concepts, but only 99 (5.7 %) are identified as MATUs. The knowledge organisation system proposed in this article is a six-level hierarchical system according to their main therapeutic target. It includes 193 different concepts organised in a cross lingual terminology server, which will allow the inclusion of semantic extensions. Ninety nine (51.3 %) MATUs concepts and 94 (48.7 %) hierarchical concepts composed the knowledge organisation system. Two separates groups (an expert group and a validation group) carried out the selection, creation and validation processes. Queries identify, for unstructured data, 83 out of 99 (83.8 %) MATUs corresponding to 45,262 patients, 347,035 hospital stays and 427,544 health documents, and for structured data, 61 out of 99 (61.6 %) MATUs corresponding to 9,218 patients, 59,643 hospital stays and 104,737 hospital prescriptions. The volume of data in the CDW demonstrated the potential for using these data in clinical research, although not all MATUs are present in the CDW (16 missing for unstructured data and 38 for structured data). The knowledge organisation system proposed here improves the understanding of MATUs, the quality of queries and helps clinical researchers retrieve relevant medical information. The use of this model in CDW allows for the rapid identification of a large number of patients and health documents, either directly by a MATU of interest (e.g. Rituximab) but also by searching for parent concepts (e.g. Anti-CD20 Monoclonal Antibody).

Keywords: Data warehousing; Monoclonal antibodies; Natural language processing; Thesaurus.

MeSH terms

  • Antibodies, Monoclonal* / therapeutic use
  • Data Warehousing
  • Europe
  • Humans
  • Systematized Nomenclature of Medicine
  • Vocabulary, Controlled*

Substances

  • Antibodies, Monoclonal