Purpose: To assess the impact of lung segmentation accuracy in an automatic pipeline for quantitative analysis of CT images.
Methods: Four platforms for automatic lung segmentation were considered, based on a convolutional neural network (CNN), a region-growing technique, or an atlas-based algorithm. The platforms were tested on CT images of 55 COVID-19 patients with severe lung impairment. Four radiologists assessed the segmentations using a 5-point qualitative score (QS). For each CT series, a manually revised reference segmentation (RS) was obtained. Histogram-based quantitative metrics (QMs) were calculated from the CT histogram using the lung segmentations from all platforms and the RS. The Dice index (DI) and the differences in QMs (ΔQMs) were calculated between the RS and each of the other segmentations.
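The two quantitative comparisons described above can be illustrated with a minimal sketch. It assumes the CT volume is a NumPy array of Hounsfield units and each segmentation is a binary mask of the same shape. The function names, the choice of the mean as a second metric alongside the 90th percentile, and the sign convention (automatic minus reference) are assumptions made for illustration; they are not taken from the study.

```python
import numpy as np

def dice_index(mask_a, mask_b):
    """Dice index between two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * np.logical_and(a, b).sum() / total

def histogram_metrics(ct_hu, mask):
    """Histogram-based metrics of the HU values inside a non-empty mask.

    The metric set here (mean, 90th percentile) is illustrative only.
    """
    voxels = np.asarray(ct_hu, dtype=float)[np.asarray(mask, dtype=bool)]
    return {
        "mean": float(voxels.mean()),
        "p90": float(np.percentile(voxels, 90)),
    }

def delta_metrics(ct_hu, auto_mask, ref_mask):
    """Difference of each metric: automatic segmentation minus reference."""
    auto = histogram_metrics(ct_hu, auto_mask)
    ref = histogram_metrics(ct_hu, ref_mask)
    return {name: auto[name] - ref[name] for name in ref}

# Toy 2x2 "volume": the automatic mask wrongly includes one dense voxel (50 HU).
ct = np.array([[-900.0, -800.0], [-700.0, 50.0]])
ref_mask = np.array([[1, 1], [1, 0]])
auto_mask = np.array([[1, 1], [1, 1]])

di = dice_index(auto_mask, ref_mask)
deltas = delta_metrics(ct, auto_mask, ref_mask)
```

In this toy case a single wrongly included dense voxel leaves the Dice index fairly high while shifting the 90th percentile far more than the mean, which is the kind of metric-dependent sensitivity the study examines.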
Results: The highest QS and the lowest ΔQM values were associated with the CNN algorithm. However, only 45% of CNN segmentations were judged to need no or only minimal corrections, and in only 17 cases (31%) did automatic segmentations provide the RS without manual corrections. Median values of the DI for the four algorithms ranged from 0.993 to 0.904. Significant differences between automatic segmentations and the RS were found for all QMs, both when data were pooled and when they were stratified according to QS, indicating a relationship between qualitative and quantitative measurements. The most unstable QM was the histogram 90th percentile, with median ΔQM values ranging from 10 HU to 158 HU across the different algorithms.
Conclusions: None of the tested algorithms provided fully reliable segmentation. Segmentation accuracy affects different quantitative metrics to different degrees, and each metric should be evaluated individually according to the purpose of the subsequent analyses.
Keywords: COVID-19; Computed tomography; Lung segmentation; QCT; Quantitative imaging; Segmentation algorithms.
Copyright © 2021 Associazione Italiana di Fisica Medica. Published by Elsevier Ltd. All rights reserved.