Adapting the open-source Gen3 platform and Kubernetes for the NIH HEAL IMPOWR and MIRHIQL clinical trial data commons: Customization, cloud transition, and optimization

J Biomed Inform. 2024 Nov:159:104749. doi: 10.1016/j.jbi.2024.104749. Epub 2024 Nov 6.

Abstract

Objective: This study presents the decision-making framework, strategies, and software used to deploy the first combined chronic pain and opioid use clinical trial data commons on the Gen3 platform.

Materials and methods: The approach adapted the open-source Gen3 platform and Kubernetes to the needs of the NIH HEAL IMPOWR and MIRHIQL networks. Key steps included customizing the Gen3 architecture, transitioning from Amazon Web Services to Google Cloud, adapting data ingestion and harmonization processes, ensuring security and compliance for the Kubernetes environment, and optimizing performance and user experience.

Results: The primary result was a fully operational IMPOWR data commons built on Gen3. Key features include a modular architecture supporting diverse clinical trial data types, automated processes for data management, fine-grained access control and auditing, and researcher-friendly interfaces for data exploration and analysis.

Discussion: The successful development of the Wake Forest IDEA-CC data commons represents a significant milestone for chronic pain and addiction research. Harmonized data from diverse studies, managed under the FAIR principles (findable, accessible, interoperable, reusable), can be discovered in a secure, scalable repository. Challenges remain in long-term maintenance and governance, but the commons provides a foundation for accelerating scientific progress. Key lessons learned include the importance of engaging both technical and domain experts, the need for flexible yet robust infrastructure, and the value of building on established open-source platforms.

Conclusion: The Wake Forest IDEA-CC Gen3 data commons demonstrates the feasibility and value of developing a shared data infrastructure for chronic pain and opioid use research. The lessons learned can inform similar efforts in other clinical domains.

Keywords: Chronic pain; Cloud computing; Data commons; Gen3; Kubernetes; Opioid.

MeSH terms

  • Analgesics, Opioid / therapeutic use
  • Chronic Pain / drug therapy
  • Clinical Trials as Topic*
  • Cloud Computing*
  • Humans
  • National Institutes of Health (U.S.)
  • Software*
  • United States

Substances

  • Analgesics, Opioid