Prediction of gene expression-based breast cancer proliferation scores from histopathology whole slide images using deep learning

BMC Cancer. 2024 Dec 11;24(1):1510. doi: 10.1186/s12885-024-13248-9.

Abstract

Background: In breast cancer, several gene expression assays have been developed to enable more personalised treatment. This study focuses on the prediction of two molecular proliferation signatures: an 11-gene proliferation score and the MKI67 proliferation marker gene. The aim was to assess whether these could be predicted from digital whole slide images (WSIs) using deep learning models.

Methods: WSIs and RNA-sequencing data from 819 invasive breast cancer patients were included for training, and models were evaluated on an internal test set of 172 cases as well as on 997 cases from a fully independent external test set. Two deep convolutional neural network (CNN) models were optimised using WSIs and gene expression readouts from RNA-sequencing data of either the proliferation signature or the proliferation marker, and assessed using Spearman correlation (ρ). Prognostic performance was assessed through Cox proportional hazards modelling, estimating hazard ratios (HR).
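
As an illustration of the evaluation metric named above, the following is a minimal sketch of Spearman rank correlation between measured and predicted expression values. It is not the authors' code: the function names (average_ranks, pearson, spearman_rho) and the toy values are assumptions made for this example, and only the Python standard library is used.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(measured, predicted):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(measured), average_ranks(predicted))


# Hypothetical toy values, not data from the study.
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 3.2, 2.9, 4.5]
rho = spearman_rho(measured, predicted)
```

Because the metric depends only on ranks, it rewards predictions that order patients correctly even when the predicted values are on a different scale from the measured ones.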

Results: Optimised CNNs successfully predicted the proliferation score and proliferation marker on the unseen internal test set (ρ = 0.691 (p < 0.001) with R² = 0.438, and ρ = 0.564 (p < 0.001) with R² = 0.251, respectively) and on the external test set (ρ = 0.502 (p < 0.001) with R² = 0.319, and ρ = 0.403 (p < 0.001) with R² = 0.222, respectively). A high proliferation score or marker was significantly associated with a higher risk of recurrence or death in the external test set (HR = 1.65 (95% CI: 1.05-2.61) and HR = 1.84 (95% CI: 1.17-2.89), respectively).
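
The abstract does not state how the R-squared values were calculated. As a sketch under the assumption that they are the usual coefficient of determination (one minus the ratio of residual to total sum of squares), the calculation could look like the following. The function name r_squared and the toy values are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical toy values, not data from the study.
observed = [1.0, 2.0, 3.0]
predicted = [1.5, 2.0, 2.5]
score = r_squared(observed, predicted)
```

Under this definition the value describes the share of variance in the measured expression explained by the predictions, so it need not equal the square of the Spearman correlation reported alongside it.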

Conclusions: The results from this study suggest that gene expression-based proliferation scores can be predicted directly from breast cancer morphology in WSIs using CNNs, and that the predictions provide prognostic information that could be used in research as well as in the clinical setting.

Keywords: Artificial intelligence; Breast cancer; Computational pathology; Gene expression; Proliferation.

MeSH terms

  • Adult
  • Aged
  • Biomarkers, Tumor* / genetics
  • Breast Neoplasms* / genetics
  • Breast Neoplasms* / pathology
  • Cell Proliferation* / genetics
  • Deep Learning*
  • Female
  • Gene Expression Profiling / methods
  • Gene Expression Regulation, Neoplastic
  • Humans
  • Ki-67 Antigen / genetics
  • Ki-67 Antigen / metabolism
  • Middle Aged
  • Prognosis

Substances

  • Biomarkers, Tumor
  • Ki-67 Antigen
  • MKI67 protein, human