Deep Learning-Based Classification of Early-Stage Mycosis Fungoides and Benign Inflammatory Dermatoses on H&E-Stained Whole-Slide Images: A Retrospective, Proof-of-Concept Study

J Invest Dermatol. 2024 Sep 19:S0022-202X(24)02101-8. doi: 10.1016/j.jid.2024.07.036. Online ahead of print.

Abstract

The diagnosis of early-stage mycosis fungoides (MF) is challenging owing to shared clinical and histopathological features with benign inflammatory dermatoses. Recent evidence has shown that deep learning (DL) can assist pathologists in cancer classification, but this field is largely unexplored for cutaneous lymphomas. This study evaluates DL in distinguishing early-stage MF from benign inflammatory dermatoses using a unique dataset of 924 H&E-stained whole-slide images from skin biopsies, including 233 patients with early-stage MF and 353 patients with benign inflammatory dermatoses. All patients with MF were diagnosed after clinicopathological correlation. The classification accuracy of weakly supervised DL models was benchmarked against 3 expert pathologists. The highest performance on a temporal test set was at ×200 magnification (0.50 μm per pixel resolution), with a mean area under the curve of 0.827 ± 0.044 and a mean balanced accuracy of 76.2 ± 3.9%. This nearly matched the 77.7% mean balanced accuracy of the 3 expert pathologists. Most (63.5%) attention heatmaps corresponded well with the pathologists' region of interest. Considering the difficulty of the MF versus benign inflammatory dermatoses classification task, the results of this study show promise for future applications of weakly supervised DL in diagnosing early-stage MF. Achieving clinical-grade performance will require larger multi-institutional datasets and improved methodologies, such as multimodal DL with incorporation of clinical data.

Keywords: Artificial intelligence; Cutaneous lymphoma; Digital pathology; Histology; Weakly supervised learning.
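
The abstract reports performance as balanced accuracy, which for a two-class task such as MF versus benign inflammatory dermatoses is the mean of sensitivity and specificity. The following is a minimal sketch of that calculation and not the authors' evaluation code. The function name `balanced_accuracy`, the label coding (1 = MF, 0 = benign), and the toy labels are assumptions made for illustration; they are not data from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels.

    Assumed coding (hypothetical, not from the study): 1 = MF, 0 = benign.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # MF called MF
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # MF called benign
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign called benign
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign called MF

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Toy labels for illustration only: 4 MF cases and 6 benign cases.
toy_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
toy_score = balanced_accuracy(toy_true, toy_pred)
```

Because each class contributes equally to the score, balanced accuracy is not inflated by the larger benign group, which is relevant when the classes are unequal in size (233 MF versus 353 benign patients here).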