Analysis of a CT patient dose database with an unsupervised clustering approach

Phys Med. 2019 Apr:60:91-99. doi: 10.1016/j.ejmp.2019.03.015. Epub 2019 Mar 29.

Abstract

Purpose: This study investigated the benefits of implementing a cluster analysis technique to extract relevant information from a computed tomography (CT) dose registry archive.

Methods: A CT patient dose database consisting of about 12,000 examinations and 29,000 single scans collected from three CT systems was interrogated. The database was divided into six subsets according to the equipment and the reference phantom used to define the dose indicators. Hierarchical (single, average, and complete linkage, Ward) and non-hierarchical (K-means) clustering methods were implemented using R software. The suitable number of clusters for each CT system was determined by analysing the dendrogram, the within-cluster sum of squares, and the cluster content. Summary statistics were produced for each cluster, and the outliers of the dose indicator distribution were investigated.
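As a rough illustration of the Ward linkage and the within-cluster sum of squares mentioned in the Methods, the following minimal Python sketch clusters a handful of toy points. It is not the authors' code: the study used R, and the point values, the function names, and the assumption that features are already on a comparable scale are all hypothetical.

```python
from itertools import combinations

def centroid(points):
    """Component-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    na, nb = len(a), len(b)
    return na * nb / (na + nb) * sq_dist(centroid(a), centroid(b))

def ward_clusters(points, k):
    """Naive agglomerative Ward clustering, stopped when k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        # Pick the pair whose merge increases the total WSS the least.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    return clusters

def wss(clusters):
    """Total within-cluster sum of squares."""
    return sum(sq_dist(p, centroid(c)) for c in clusters for p in c)

# Hypothetical, already-scaled scan parameters (not data from the study).
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
    (10.0, 0.0), (10.1, 0.0),
]

# WSS for each candidate number of clusters, as used in an elbow-style check.
wss_by_k = {k: wss(ward_clusters(points, k)) for k in range(1, 6)}
for k, value in wss_by_k.items():
    print(k, round(value, 3))
```

Plotting or tabulating `wss_by_k` against `k` gives the kind of curve that, together with the dendrogram and the cluster content, can inform the choice of the number of clusters. The naive search over all cluster pairs is only practical for toy inputs; a database of tens of thousands of scans would call for an optimized library routine.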

Results: Ward clustering identified the most common combinations of scanning parameters for each group. The optimal number of clusters for each CT system ranged from 5 to 15. The main diagnostic applications were then extracted from each cluster. Outlier analysis of the dose indicator distribution of each cluster revealed potentially improper settings that resulted in increased patient dose.
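The abstract does not state which outlier criterion was used. As one possible sketch, the Python snippet below flags values outside Tukey's 1.5 × IQR fences within each cluster. The fence rule, the cluster names, and the dose values are assumptions chosen for illustration, not the authors' method or data.

```python
from statistics import quantiles

def tukey_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (assumed criterion)."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical dose indicator values per cluster (not data from the study).
dose_by_cluster = {
    "cluster_1": [10.2, 9.8, 10.5, 10.1, 9.9, 10.3, 24.0],
    "cluster_2": [4.1, 3.9, 4.0, 4.2, 4.3, 3.8],
}

# Outliers are evaluated separately inside each cluster, so a dose that is
# ordinary for one parameter combination can still be flagged in another.
flagged = {name: tukey_outliers(vals) for name, vals in dose_by_cluster.items()}
print(flagged)
```

Flagged scans would then be reviewed individually, since a high value may reflect patient size or clinical need rather than an improper setting.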

Conclusions: Clustering methods applied to CT patient dose archives provide a quick and effective overview of the main combinations of exposure parameters currently in use and their consequences for dose indicator distributions, even when protocol labels and/or study descriptions are not homogeneous.

Keywords: CT protocol optimization; Cluster analysis; Outlier analysis; Patient dose archive.

MeSH terms

  • Cluster Analysis*
  • Data Mining / methods
  • Databases, Factual
  • Humans
  • Phantoms, Imaging
  • Radiation Dosage*
  • Registries
  • Software
  • Tomography, X-Ray Computed*
  • Unsupervised Machine Learning*