Evaluation of Generative Adversarial Networks for High-Resolution Synthetic Image Generation of Circumpapillary Optical Coherence Tomography Images for Glaucoma

JAMA Ophthalmol. 2022 Oct 1;140(10):974-981. doi: 10.1001/jamaophthalmol.2022.3375.

Abstract

Importance: Deep learning (DL) networks require large data sets for training, which can be challenging to collect clinically. Generative models could be used to generate large numbers of synthetic optical coherence tomography (OCT) images to train such DL networks for glaucoma detection.

Objective: To assess whether generative models can synthesize circumpapillary optic nerve head OCT images of normal and glaucomatous eyes and determine the usability of synthetic images for training DL models for glaucoma detection.

Design, setting, and participants: Progressively growing generative adversarial network models were trained to generate circumpapillary OCT scans. Image gradeability and authenticity were evaluated on a clinical set of 100 real and 100 synthetic images by 2 clinical experts. DL networks for glaucoma detection were trained with real or synthetic images and evaluated on independent internal and external test data sets of 140 and 300 real images, respectively.

Main outcomes and measures: Evaluations of the clinical set between the experts were compared. Glaucoma detection performance of the DL networks was assessed using area under the curve (AUC) analysis. Class activation maps provided visualizations of the regions contributing to the respective classifications.

Results: A total of 990 normal and 862 glaucomatous eyes were analyzed. Evaluations of the clinical set were similar for gradeability (expert 1: 92.0%; expert 2: 93.0%) and authenticity (expert 1: 51.8%; expert 2: 51.3%). The best-performing DL network trained on synthetic images had AUC scores of 0.97 (95% CI, 0.95-0.99) on the internal test data set and 0.90 (95% CI, 0.87-0.93) on the external test data set, compared with AUCs of 0.96 (95% CI, 0.94-0.99) on the internal test data set and 0.84 (95% CI, 0.80-0.87) on the external test data set for the network trained with real images. An increase in the AUC for the synthetic DL network was observed with the use of larger synthetic data set sizes. Class activation maps showed that the regions of the synthetic images contributing to glaucoma detection were generally similar to that of real images.

Conclusions and relevance: DL networks trained with synthetic OCT images for glaucoma detection were comparable with networks trained with real images. These results suggest potential use of generative models in the training of DL networks and as a means of data sharing across institutions without patient information confidentiality issues.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Deep Learning*
  • Glaucoma* / diagnosis
  • Humans
  • Optic Disk* / diagnostic imaging
  • Tomography, Optical Coherence / methods
  • Visual Fields