The risk of shortcutting in deep learning algorithms for medical imaging research

Sci Rep. 2024 Nov 25;14(1):29224. doi: 10.1038/s41598-024-79838-6.

Abstract

While deep learning (DL) offers the compelling ability to detect details beyond human vision, its black-box nature makes it prone to misinterpretation. A key problem is algorithmic shortcutting, where DL models base their predictions on patterns in the data that are easy to detect algorithmically but potentially misleading. Shortcutting makes it trivial to create models with surprisingly accurate predictions that lack any face validity. This case study shows how easily shortcut learning happens, why it is dangerous, how complex it can be, and how hard it is to counter. We use simple ResNet18 convolutional neural networks (CNNs) to train models to do two things they should not be able to do: predict which patients avoid consuming refried beans or beer purely by examining their knee X-rays (AUC of 0.63 for refried beans and 0.73 for beer). We then show how these models' abilities are tied to several confounding and latent variables in the images. Moreover, the image features the models use to shortcut cannot simply be removed or adjusted through pre-processing. The upshot is that we must raise the threshold for evaluating research that uses CNNs to claim that new medical attributes can be detected in medical images.
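To make the setup concrete, the following is a minimal sketch (not the authors' actual pipeline) of the kind of experiment the abstract describes: fine-tuning an ImageNet-pretrained ResNet18 to predict a binary, clinically implausible label (e.g., "consumes beer: yes/no") from knee X-ray images and scoring it with ROC AUC. The directory layout, label encoding, learning rate, and epoch count are all illustrative assumptions.

```python
# Hedged sketch: ResNet18 fine-tuning for a binary label on X-ray images,
# evaluated with ROC AUC. Paths and hyperparameters are hypothetical.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms
from sklearn.metrics import roc_auc_score

device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical ImageFolder layout: knee_xrays/{train,val}/<label>/*.png
tfm = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),  # X-rays are single-channel
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
train_ds = datasets.ImageFolder("knee_xrays/train", transform=tfm)  # assumed path
val_ds = datasets.ImageFolder("knee_xrays/val", transform=tfm)      # assumed path
train_dl = DataLoader(train_ds, batch_size=32, shuffle=True)
val_dl = DataLoader(val_ds, batch_size=32)

# ResNet18 backbone with a single-logit head for binary prediction
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, 1)
model = model.to(device)

opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.BCEWithLogitsLoss()

for epoch in range(5):  # epoch count is illustrative
    model.train()
    for x, y in train_dl:
        x, y = x.to(device), y.float().unsqueeze(1).to(device)
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

# Validation ROC AUC, the metric reported in the abstract (0.63 / 0.73)
model.eval()
scores, labels = [], []
with torch.no_grad():
    for x, y in val_dl:
        p = torch.sigmoid(model(x.to(device))).squeeze(1).cpu()
        scores.extend(p.tolist())
        labels.extend(y.tolist())
print("ROC AUC:", roc_auc_score(labels, scores))
```

The point of the paper is that even a pipeline this generic can yield a nontrivial AUC on a label with no plausible causal link to knee anatomy, because the network exploits confounding and latent variables baked into the images rather than any genuine medical signal.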

MeSH terms

  • Algorithms*
  • Deep Learning*
  • Diagnostic Imaging / methods
  • Humans
  • Image Processing, Computer-Assisted / methods
  • Neural Networks, Computer*