Rationale and objectives: To evaluate a new approach for establishing compliance of segmentation tools with the computed tomography volumetry profile of the Quantitative Imaging Biomarker Alliance (QIBA), and to determine the statistical exchangeability of real and simulated lesions through an international challenge.
Materials and methods: The study used an anthropomorphic phantom with 16 embedded physical lesions and 30 patient cases with pathologically confirmed malignancies from the Reference Image Database to Evaluate Therapy Response. Hybrid datasets were generated in two ways: simulated lesions corresponding to the physical lesions were virtually inserted into the phantom datasets using one projection-domain method (Method 1) and two image-domain methods (Methods 2 and 3), and simulated lesions corresponding to real lesions were virtually inserted into the Reference Image Database to Evaluate Therapy Response dataset using Method 2. The volumes of the real and simulated lesions were compared in terms of bias (the mean difference in measured volume between physical and virtually inserted lesions in phantoms, as quantified by segmentation algorithms), repeatability, reproducibility, and equivalence (phantom phase), and overall QIBA compliance (phantom and clinical phases).
Results: In the phantom phase, three of eight groups were fully QIBA compliant, and one was marginally compliant. For these four groups, the estimated biases were -1.8 ± 1.4%, -2.5 ± 1.1%, -3 ± 1%, and -1.8 ± 1.5% (estimate ± 95% confidence interval). No virtual insertion method showed statistical equivalence to physical insertion in bias equivalence testing using Schuirmann's two one-sided test (±5% equivalence margin). Repeatability and reproducibility were largely comparable between physical and simulated lesions (differences of 0.1%-16% and 7%-18%, respectively). In the clinical phase, 7 of 16 groups were QIBA compliant.
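The equivalence test named above can be illustrated with a minimal sketch. It assumes paired percent volume differences (simulated minus physical) per lesion and uses a large-sample normal approximation in place of the t distribution; the function name `tost_paired`, the example values, and the paired design are hypothetical and are not taken from the study.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def tost_paired(diffs, margin=5.0, alpha=0.05):
    """Two one-sided tests (TOST) on paired percent differences.

    Assumption: normal approximation to the sampling distribution of the
    mean difference; a t-based version would be more exact for small n.
    Returns (mean difference, TOST p-value, equivalence decision).
    """
    n = len(diffs)
    d_bar = mean(diffs)
    se = stdev(diffs) / sqrt(n)
    if se == 0:
        raise ValueError("differences have zero variance")
    z = NormalDist()
    # H0 (lower): true mean difference <= -margin
    p_lower = 1.0 - z.cdf((d_bar + margin) / se)
    # H0 (upper): true mean difference >= +margin
    p_upper = z.cdf((d_bar - margin) / se)
    # Equivalence is declared only if BOTH one-sided nulls are rejected,
    # so the overall p-value is the larger of the two.
    p_value = max(p_lower, p_upper)
    return d_bar, p_value, p_value < alpha


# Hypothetical percent differences for eight lesions (illustration only).
example = [-1.0, 0.5, -2.0, 1.5, 0.0, -0.5, 1.0, -1.5]
d_bar, p_value, equivalent = tost_paired(example, margin=5.0)
print(f"mean diff = {d_bar:.2f}%, p = {p_value:.3g}, equivalent = {equivalent}")
```

Under this formulation, a failure to show equivalence means that at least one one-sided null could not be rejected at the chosen margin; it is not by itself evidence of a large difference.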
Conclusion: Hybrid datasets yielded conclusions similar to those from real computed tomography datasets: groups that were QIBA compliant on the phantom data were also compliant on the hybrid datasets. Some groups were deemed compliant for the simulated methods but not for the physical lesion measurements; the magnitude of this difference was small (<5.4%). Although the technical performance of real and simulated lesions is not equivalent, the two correlate, such that volumetrically simulated lesions could potentially serve as practical proxies.
Keywords: CT; Hybrid dataset; Lung cancer; Quantitative imaging; Segmentation; Volumetry.
Copyright © 2018 The Association of University Radiologists. All rights reserved.