The tip of the iceberg: challenges of accessing hospital electronic health record data for biological data mining

BioData Min. 2016 Sep 22:9:29. doi: 10.1186/s13040-016-0109-1. eCollection 2016.

Abstract

Modern cohort studies include self-reported measures on disease, behavior and lifestyle, sensor-based observations from mobile phones and wearables, and rich -omics data. Follow-up is often achieved through electronic health record (EHR) linkages across primary and secondary healthcare providers. Historically however, researchers typically only get to see the tip of the iceberg: coded administrative data relating to healthcare claims which mainly record billable diagnoses and procedures. The rich data generated during the clinical pathway remain submerged and inaccessible. While some institutions and initiatives have made good progress in unlocking such deep phenotypic data within their institutional realms, access at scale still remains challenging. Here we outline and discuss the main technical and social challenges associated with accessing these data for data mining and hauling the entire iceberg.

Publication types

  • Editorial