The custodian administered research extract server: "improving the pipeline" in linked data delivery systems

Health Inf Sci Syst. 2014 Aug 18:2:6. doi: 10.1186/2047-2501-2-6. eCollection 2014.

Abstract

Background: At Western Australia's Data Linkage Branch (DLB), the extraction of linked data has become increasingly complex over the past decade, and classical methods of data delivery are unsuited to the larger extractions which have become the norm. The Custodian Administered Research Extract Server (CARES) is a fast, accurate and predictable approach to linked data extraction.

Methods: The DLB creates linkage keys within and between datasets. To comply with the separation principle, these keys are sent to applicable data collection agencies for extraction. Routing requests through multiple channels is inefficient and makes it hard to monitor work and predict delivery times. CARES was developed to address these shortcomings. Its development involved ongoing consultation with the Custodians and staff of collections, and had to overcome challenges of hardware, programming, governance and security.

Results: The introduction of CARES has reduced the workload of linked data extractions, while improving efficiency and making turnaround times more stable and predictable.

Conclusions: As the scope of a linkage system broadens, challenges in data delivery are inevitable. CARES overcomes multiple obstacles without sacrificing the integrity, confidentiality or security of data. CARES is a valuable component of linkage infrastructure that is operable at any scale and adaptable to many data environments.