ETL Framework for Real-Time Business Intelligence over Medical Imaging Repositories

J Digit Imaging. 2019 Oct;32(5):870-879. doi: 10.1007/s10278-019-00184-5.

Abstract

In recent decades, the volume of medical imaging studies and associated metadata has increased rapidly. Although these studies are mostly used to support medical diagnosis and treatment, many recent initiatives advocate using them in clinical research and to improve the business practices of medical institutions. However, the continuous production of medical imaging studies, coupled with the tremendous amount of associated data, makes real-time analysis of medical imaging repositories difficult with conventional tools and methodologies. These archives contain not only the image data itself but also a wide range of valuable metadata describing all the stakeholders involved in the examination. Exploring this data with analytics technologies can increase the efficiency and quality of medical practice. In major centers, it represents a big data scenario in which Business Intelligence (BI) and Data Analytics (DA) are rare and, where present, are implemented through data warehousing approaches. This article proposes an Extract, Transform, Load (ETL) framework for medical imaging repositories that is able to feed, in real time, a BI application developed for this purpose. The solution was designed to provide the environment needed to conduct research on top of live institutional repositories without requiring the creation of a data warehouse. It features an extensible dashboard with customizable charts and reports, and an intuitive web-based interface that enables the use of novel data mining techniques, namely a variety of data cleansing tools, filters, and clustering functions. As a result, users do not need the programming skills, such as Python or R, that are commonly required of data analysts and data scientists.
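As an illustration only, the extract, transform, and load stages described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the framework presented in the article: the sample records, the function names (extract, transform, load, run_etl), and the in-memory counter standing in for a dashboard data source are all hypothetical. The only details taken from DICOM are the attribute keywords StudyInstanceUID, Modality, and StudyDate, and the YYYYMMDD date format.

```python
from collections import Counter
from datetime import datetime

# Hypothetical study-level metadata, keyed by DICOM attribute keywords.
RAW_STUDIES = [
    {"StudyInstanceUID": "1.2.3.1", "Modality": "ct ", "StudyDate": "20190102"},
    {"StudyInstanceUID": "1.2.3.2", "Modality": "MR", "StudyDate": "20190102"},
    {"StudyInstanceUID": "1.2.3.3", "Modality": "CT", "StudyDate": ""},  # missing date
    {"StudyInstanceUID": "1.2.3.4", "Modality": "US", "StudyDate": "20190103"},
]


def extract(repository, seen_uids):
    """Yield only records not seen in an earlier pass (incremental extraction)."""
    for record in repository:
        uid = record["StudyInstanceUID"]
        if uid not in seen_uids:
            seen_uids.add(uid)
            yield record


def transform(record):
    """Cleanse one record; return None when a required field is unusable."""
    modality = record.get("Modality", "").strip().upper()
    raw_date = record.get("StudyDate", "").strip()
    if not modality or not raw_date:
        return None
    try:
        study_date = datetime.strptime(raw_date, "%Y%m%d").date()
    except ValueError:
        return None
    return {"uid": record["StudyInstanceUID"], "modality": modality, "date": study_date}


def load(rows, counts):
    """Update the in-memory aggregate that a chart could read (assumed target)."""
    for row in rows:
        counts[(row["date"], row["modality"])] += 1


def run_etl(repository, seen_uids, counts):
    """Run one ETL pass over the repository and return the updated aggregate."""
    cleaned = (transform(record) for record in extract(repository, seen_uids))
    load((row for row in cleaned if row is not None), counts)
    return counts


seen = set()
studies_per_day = Counter()
run_etl(RAW_STUDIES, seen, studies_per_day)
run_etl(RAW_STUDIES, seen, studies_per_day)  # second pass finds nothing new
```

Tracking already-processed UIDs lets each pass touch only new studies, which is one simple way to keep an aggregate current over a live repository without first copying it into a separate warehouse. A real deployment would read from the archive itself rather than from a Python list.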

Keywords: Big data; Business Intelligence; Cloud; DICOM; Data Analytics; PACS.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Data Mining / methods*
  • Data Mining / statistics & numerical data
  • Data Warehousing / methods*
  • Data Warehousing / statistics & numerical data
  • Humans
  • Metadata / statistics & numerical data*
  • Radiology Information Systems / organization & administration*
  • Radiology Information Systems / statistics & numerical data*