Deep learning networks on chronic liver disease assessment with fine-tuning of shear wave elastography image sequences

Phys Med Biol. 2020 Nov 5;65(21):215027. doi: 10.1088/1361-6560/abae06.

Abstract

Chronic liver disease (CLD) is currently one of the major causes of death worldwide. If not treated, it may lead to cirrhosis, hepatic carcinoma and death. Ultrasound (US) shear wave elastography (SWE) is a relatively new, non-invasive technique that is popular among radiologists. Although many studies have validated the SWE technique, either in a clinical setting or by applying machine learning to SWE elastograms, minimal work has been done on comparing the performance of popular pre-trained deep learning networks on CLD assessment. The currently available literature reports technical advancements on specific deep learning structures, with specific inputs and usually on a limited group of CLD fibrosis stage classes, with limited comparison of competing deep learning schemes fed with different input types. The aim of the present study is to compare some popular pre-trained deep learning networks using temporally stable and full elastograms, with or without augmentation, and to propose suitable deep learning schemes for CLD diagnosis and progress assessment. Two hundred patients with liver-biopsy-validated CLD underwent US SWE examination. Four images from the same liver area were saved to extract elastograms and processed to exclude areas that were temporally unstable. Then, full and temporally stable masked elastograms for each patient were separately fed into GoogLeNet, AlexNet, VGG16, ResNet50 and DenseNet201, with and without augmentation. The networks were tested for differentiation of CLD stages in seven classification schemes over 30 repetitions, using liver biopsy as the reference. All networks achieved maximum mean accuracies ranging from 87.2% to 97.4% and areas under the receiver operating characteristic curve (AUCs) ranging from 0.979 to 0.990, while the radiologists had AUCs ranging from 0.800 to 0.870. ResNet50 and DenseNet201 had better average performance than the other networks.
The use of the temporal stability mask led to improved performance in about 50% of input and network combinations, while augmentation led to lower performance for all networks. These findings point to candidate networks and settings with higher accuracy for CLD diagnosis and progress assessment. A larger data set would help identify the best network and settings for CLD assessment in clinical practice.
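
The abstract does not state how temporal stability was computed, so the following is only an illustrative sketch of the general idea of masking an elastogram across four frames. The criterion used here (a pixel is kept when its range across frames, relative to its mean, is at most `rel_tol`), the function names `temporal_stability_mask` and `apply_mask`, the threshold value and the toy stiffness values are all assumptions made for illustration and are not taken from the study.

```python
from statistics import mean


def temporal_stability_mask(frames, rel_tol=0.2):
    """Return a 2D boolean mask over pixels of equally sized frames.

    Hypothetical criterion: a pixel is "stable" when
    (max - min) / mean across frames is <= rel_tol.
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    mask = []
    for r in range(rows):
        mask_row = []
        for c in range(cols):
            values = [frame[r][c] for frame in frames]
            avg = mean(values)
            spread = max(values) - min(values)
            # A pixel with non-positive mean carries no usable signal here,
            # so it is treated as unstable.
            mask_row.append(avg > 0 and spread / avg <= rel_tol)
        mask.append(mask_row)
    return mask


def apply_mask(frame, mask, fill=0.0):
    """Keep stable pixels of one frame and replace the rest with `fill`."""
    return [
        [value if keep else fill for value, keep in zip(row, mask_row)]
        for row, mask_row in zip(frame, mask)
    ]


# Four toy 2x2 "elastograms" of the same area (made-up stiffness values).
frames = [
    [[10.0, 5.0], [20.0, 8.0]],
    [[10.5, 9.0], [19.0, 8.2]],
    [[9.8, 2.0], [21.0, 7.9]],
    [[10.2, 6.0], [20.5, 8.1]],
]

mask = temporal_stability_mask(frames)
masked = apply_mask(frames[0], mask)
print(mask)    # the top-right pixel varies strongly and is excluded
print(masked)
```

In this sketch the masked frame would stand in for the "temporally stable masked elastogram" input, while the unmasked frame corresponds to the "full elastogram" input; the actual masking procedure used in the study may differ.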

MeSH terms

  • Biopsy
  • Chronic Disease
  • Deep Learning*
  • Elasticity Imaging Techniques*
  • Female
  • Humans
  • Image Processing, Computer-Assisted / methods*
  • Liver Diseases / diagnostic imaging*
  • Liver Diseases / pathology
  • Male
  • Middle Aged
  • ROC Curve