Objective: Data from electronic healthcare records (EHRs) can be used to monitor drug safety, but to compare and pool data from different EHR databases, the extraction of potential adverse events must be harmonized. In this paper, we describe the procedure used to harmonize the extraction, from eight European EHR databases, of five events of interest deemed important in pharmacovigilance: acute myocardial infarction (AMI); acute renal failure (ARF); anaphylactic shock (AS); bullous eruption (BE); and rhabdomyolysis (RHABD).
Design: The participating databases comprise general practitioners' medical records and claims for hospitalization and other healthcare services. Clinical information is collected using four different disease terminologies and free text in two different languages. The Unified Medical Language System was used to identify concepts and corresponding codes in each terminology. A common database model was used to share and pool data and verify the semantic basis of the event extraction queries. Feedback from the database holders was obtained at various stages to refine the extraction queries.
Measurements: Standardized and age-specific incidence rates (IRs) were calculated, in an iterative process, to benchmark and harmonize event data extraction across the databases.
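As an illustration of how age-specific and standardized IRs relate, the following minimal sketch computes both using direct standardization. The choice of direct standardization, the age bands, the event counts, the person-years, and the reference weights are all assumptions made for this example; they are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class AgeStratum:
    label: str              # age band (hypothetical)
    events: int             # events observed in this band (made-up)
    person_years: float     # follow-up time in this band (made-up)
    standard_weight: float  # share of the reference population in this band


def age_specific_rate(stratum: AgeStratum, per: float = 100_000) -> float:
    """Events per `per` person-years within one age band."""
    return stratum.events / stratum.person_years * per


def crude_rate(strata: list[AgeStratum], per: float = 100_000) -> float:
    """Overall rate, ignoring the age structure of the population."""
    total_events = sum(s.events for s in strata)
    total_py = sum(s.person_years for s in strata)
    return total_events / total_py * per


def directly_standardized_rate(strata: list[AgeStratum], per: float = 100_000) -> float:
    """Weighted average of age-specific rates, using reference-population weights."""
    total_weight = sum(s.standard_weight for s in strata)
    weighted = sum(age_specific_rate(s, per) * s.standard_weight for s in strata)
    return weighted / total_weight


# Illustrative data only; not from any of the participating databases.
strata = [
    AgeStratum("0-39", events=10, person_years=200_000.0, standard_weight=0.5),
    AgeStratum("40-64", events=60, person_years=100_000.0, standard_weight=0.3),
    AgeStratum("65+", events=150, person_years=50_000.0, standard_weight=0.2),
]

for s in strata:
    print(f"{s.label}: {age_specific_rate(s):.1f} per 100 000 PYs")
print(f"crude: {crude_rate(strata):.1f} per 100 000 PYs")
print(f"standardized: {directly_standardized_rate(strata):.1f} per 100 000 PYs")
```

Because every database is weighted to the same reference age structure, differences in standardized IRs are less likely to reflect differences in population age and more likely to reflect differences in event extraction.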
Results: The study population comprised 19 647 445 individuals overall, with a follow-up of 59 929 690 person-years (PYs). Age-adjusted IRs for the five events of interest across the databases were as follows: (1) AMI: 60-148/100 000 PYs; (2) ARF: 3-49/100 000 PYs; (3) AS: 2-12/100 000 PYs; (4) BE: 2-17/100 000 PYs; and (5) RHABD: 0.1-8/100 000 PYs.
Conclusions: The iterative harmonization process enabled a more homogeneous identification of events across differently structured databases using different coding-based algorithms. This workflow can facilitate transparent and reproducible event extractions and improve understanding of differences between databases.