A hybrid explainable ensemble transformer encoder for pneumonia identification from chest X-ray images

J Adv Res. 2023 Jun:48:191-211. doi: 10.1016/j.jare.2022.08.021. Epub 2022 Sep 7.

Abstract

Introduction: Pneumonia is a microbial infection that causes chronic inflammation of human lung cells. Chest X-ray imaging is the most well-known screening approach for detecting pneumonia in its early stages. Because chest X-ray images are mostly blurry with low illumination, a strong feature extraction approach is required for promising identification performance.

Objectives: A new hybrid explainable deep learning framework is proposed for accurate pneumonia disease identification using chest X-ray images.

Methods: The proposed hybrid workflow fuses the capabilities of ensemble convolutional networks and the Transformer Encoder mechanism. The ensemble learning backbone extracts strong features from the raw input X-ray images in two different scenarios: ensemble A (i.e., DenseNet201, VGG16, and GoogleNet) and ensemble B (i.e., DenseNet201, InceptionResNetV2, and Xception). The Transformer Encoder, built on the self-attention mechanism with a multilayer perceptron (MLP), performs the disease identification. Visual explainable saliency maps are derived to highlight the regions of the input X-ray images that are most relevant to each prediction. All proposed deep learning models are trained end to end for both binary and multi-class classification.
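To make the data flow in the Methods paragraph concrete, the following is a minimal NumPy sketch of one way backbone features could be passed through a self-attention encoder block with an MLP. It is not the authors' implementation: the abstract does not say how the three feature sets are fused, so treating each backbone's feature vector as one token is an assumption, as are the feature sizes, the single attention head, the pre-norm layout, the random weights, and every function name.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def layer_norm(x, eps=1e-6):
    # Normalise each token to zero mean and unit variance (no learned scale).
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def encoder_block(tokens, p):
    """One encoder block: single-head self-attention, then a two-layer MLP."""
    h = layer_norm(tokens)
    q, k, v = h @ p["wq"], h @ p["wk"], h @ p["wv"]
    # Scaled dot-product attention; each row of attn sums to 1.
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    tokens = tokens + (attn @ v) @ p["wo"]          # residual connection
    hidden = np.maximum(layer_norm(tokens) @ p["w1"], 0.0)  # ReLU
    return tokens + hidden @ p["w2"], attn          # residual connection


def classify(backbone_features, p):
    # Assumption: project each backbone's vector to a shared width so that
    # every backbone contributes exactly one token to the encoder.
    tokens = np.stack([f @ w for f, w in zip(backbone_features, p["proj"])])
    tokens, attn = encoder_block(tokens, p)
    logits = tokens.mean(axis=0) @ p["head"]        # mean-pool, then linear head
    return softmax(logits), attn


def init_params(rng, feature_dims, d_model=64, d_mlp=128, n_classes=2):
    def w(*shape):
        return rng.normal(0.0, 0.02, size=shape)    # untrained placeholder weights

    return {
        "proj": [w(d, d_model) for d in feature_dims],
        "wq": w(d_model, d_model), "wk": w(d_model, d_model),
        "wv": w(d_model, d_model), "wo": w(d_model, d_model),
        "w1": w(d_model, d_mlp), "w2": w(d_mlp, d_model),
        "head": w(d_model, n_classes),
    }


rng = np.random.default_rng(0)
feature_dims = (1920, 512, 1024)   # placeholder sizes for three backbones
params = init_params(rng, feature_dims)
# Random vectors stand in for pooled CNN features of a single X-ray image.
features = [rng.normal(size=d) for d in feature_dims]
probs, attn = classify(features, params)
print("class probabilities:", probs)
print("attention over backbone tokens:\n", attn)
```

With untrained weights the probabilities carry no meaning; the sketch only shows the shapes involved: three backbone tokens, a 3 x 3 attention matrix, and one probability per class.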

Results: The proposed hybrid deep learning model achieved 99.21% overall accuracy and F1-score for the binary classification task, and 98.19% accuracy and 97.29% F1-score for the multi-class classification task. For the ensemble binary identification scenario, ensemble A recorded 97.22% accuracy and 97.14% F1-score, while ensemble B achieved 96.44% for both accuracy and F1-score. For the ensemble multi-class identification scenario, ensemble A recorded 97.2% accuracy and 95.8% F1-score, while ensemble B recorded 96.4% accuracy and 94.9% F1-score.
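The Results report accuracy and F1-score side by side, and the two diverge in the multi-class figures. The pure-Python sketch below shows how both metrics are computed and why they can differ when classes are imbalanced. The abstract does not state how the F1-score was averaged across classes, so macro averaging here is an assumption; the labels 0, 1, 2 and the toy predictions are invented for illustration and are not data from the study.

```python
def accuracy(y_true, y_pred):
    # Fraction of samples whose predicted label matches the true label.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_label(y_true, y_pred, label):
    # One-vs-rest counts for a single class.
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    # Unweighted mean of per-class F1, so rare classes count as much as common ones.
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


# Toy, imbalanced three-class example (illustrative only).
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 0, 0]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy = {acc:.3f}, macro F1 = {f1:.3f}")
```

In this toy case the single sample of the rare class is missed, which costs one tenth of the accuracy but pulls that class's F1 to zero, so the macro F1 falls well below the accuracy.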

Conclusion: The proposed hybrid deep learning framework could provide promising and encouraging explainable identification performance compared with individual models, ensemble models, and even the latest AI models in the literature. The code is available here: https://github.com/chiagoziemchima/Pneumonia_Identificaton.

Keywords: Chest X-ray imaging; Explainable artificial intelligence (XAI); Pneumonia identification; Self-attention network; Transfer ensemble learning; Transformer encoder (TE).

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Electric Power Supplies
  • Humans
  • Inflammation
  • Pneumonia* / diagnostic imaging
  • Thorax
  • X-Rays