The effects of physics-based data augmentation on the generalizability of deep neural networks: Demonstration on nodule false-positive reduction

Med Phys. 2019 Oct;46(10):4563-4574. doi: 10.1002/mp.13755. Epub 2019 Aug 27.

Abstract

Purpose: An important challenge for deep learning models is generalizing to new datasets that may be acquired with protocols different from those of the training set. It is not always feasible to expand training data to the range encountered in clinical practice. We introduce a new technique, physics-based data augmentation (PBDA), that can emulate new computed tomography (CT) data acquisition protocols. We demonstrate two forms of PBDA, emulating increases in slice thickness and reductions of dose, on the specific problem of false-positive reduction in the automatic detection of lung nodules.

Methods: We worked with CT images from the Lung Image Database Consortium (LIDC) collection. We employed a hybrid ensemble convolutional neural network (CNN), which consists of multiple CNN modules (VGG, DenseNet, ResNet), for a classification task of determining whether an image patch was a suspicious nodule or a false positive. To emulate a reduction in tube current, we injected noise by simulating forward projection, noise addition, and backprojection corresponding to 1.5 mAs (a "chest x-ray" dose). To simulate thick-slice CT scans from thin-slice CT scans, we grouped and averaged spatially contiguous slices within the thin-slice data. The neural network was trained with 10% of the LIDC dataset that was selected to have either the highest tube current or the thinnest slices. The network was tested on the remaining data. We compared PBDA to a baseline with standard geometric augmentations (such as shifts and rotations) and Gaussian noise addition.
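
As a rough illustration of the two augmentations described in the Methods, the following Python sketch shows slice thickening by averaging contiguous slices and noise injection in the projection domain. It is not the authors' implementation: the function names, the array layout (slices, rows, columns), and the simple Poisson counting model with a `photons_per_ray` parameter are assumptions made for this example, and the forward projection and backprojection steps are omitted.

```python
import numpy as np


def thicken_slices(volume, factor):
    """Emulate thicker slices by averaging groups of contiguous thin slices.

    volume: array of shape (n_slices, rows, cols), assumed layout.
    factor: number of thin slices merged into one thick slice.
    Trailing slices that do not fill a complete group are dropped.
    """
    n_groups = volume.shape[0] // factor
    trimmed = volume[: n_groups * factor]
    grouped = trimmed.reshape(n_groups, factor, *volume.shape[1:])
    return grouped.mean(axis=1)


def add_projection_noise(line_integrals, photons_per_ray, rng):
    """Inject dose-dependent noise into projection data (assumed model).

    line_integrals: attenuation line integrals (a sinogram), any shape.
    photons_per_ray: incident photons per ray; a lower value stands in
        for a lower tube current-time product (mAs). Hypothetical knob.
    rng: a numpy.random.Generator.
    """
    expected_counts = photons_per_ray * np.exp(-line_integrals)
    counts = rng.poisson(expected_counts)
    counts = np.maximum(counts, 1)  # avoid log(0) for photon-starved rays
    return -np.log(counts / photons_per_ray)


# Usage on synthetic data (no real CT data involved).
rng = np.random.default_rng(0)
thin_volume = np.arange(16, dtype=float).reshape(4, 2, 2)
thick_volume = thicken_slices(thin_volume, factor=2)

sinogram = np.full((3, 5), 1.0)
noisy_sinogram = add_projection_noise(sinogram, photons_per_ray=1e8, rng=rng)
```

In a full pipeline, the noisy projections would be reconstructed back into image space before patches are extracted. The photon count corresponding to a given mAs depends on the scanner and is not stated in the abstract.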

Results: PBDA improved the performance of the networks when generalizing to the test dataset in a limited number of cases. We found that the best performance was obtained by applying augmentation at very low doses (1.5 mAs), about an order of magnitude less than most screening protocols. In the baseline augmentation, a comparable level of Gaussian noise was injected. For dose reduction PBDA, the average sensitivity of 0.931 for the hybrid ensemble network was not statistically different from the average sensitivity of 0.935 without PBDA. Similarly, for slice thickness PBDA, the average sensitivity of 0.900 when augmenting with doubled simulated slice thicknesses was not statistically different from the average sensitivity of 0.895 without PBDA. Although we observed improvements in some cases detailed in this paper, the overall picture suggests that PBDA may not be an effective data enrichment tool.

Conclusions: PBDA is a newly proposed strategy for mitigating the performance loss of neural networks related to the variation of acquisition protocol between the training dataset and the data that is encountered in deployment or testing. We found that PBDA does not provide robust improvements with the four neural networks (three modules and the ensemble) tested and for the specific task of false-positive reduction in nodule detection.

Keywords: data augmentation; ensemble CNN; false-positive reduction; lung CT; lung nodule detection.

MeSH terms

  • Deep Learning*
  • False Positive Reactions
  • Humans
  • Image Processing, Computer-Assisted / methods*
  • Lung Neoplasms / diagnostic imaging*
  • Normal Distribution
  • Radiation Dosage
  • Sensitivity and Specificity
  • Tomography, X-Ray Computed*