Digital profiling of gene expression from histology images with linearized attention

Nat Commun. 2024 Nov 14;15(1):9886. doi: 10.1038/s41467-024-54182-5.

Abstract

Cancer is a heterogeneous disease requiring costly genetic profiling for better understanding and management. Recent advances in deep learning have enabled cost-effective predictions of genetic alterations from whole slide images (WSIs). While transformers have driven significant progress in non-medical domains, their application to WSIs lags behind due to high model complexity and limited dataset sizes. Here, we introduce SEQUOIA, a linearized transformer model that predicts cancer transcriptomic profiles from WSIs. SEQUOIA is developed using 7584 tumor samples across 16 cancer types, with its generalization capacity validated on two independent cohorts comprising 1368 tumors. Accurately predicted genes are associated with key cancer processes, including inflammatory response, the cell cycle and metabolism. Further, we demonstrate the value of SEQUOIA in stratifying the risk of breast cancer recurrence and in resolving spatial gene expression at loco-regional levels. SEQUOIA hence deciphers clinically relevant information from WSIs, opening avenues for personalized cancer management.

MeSH terms

  • Breast Neoplasms / genetics
  • Breast Neoplasms / metabolism
  • Breast Neoplasms / pathology
  • Deep Learning
  • Female
  • Gene Expression Profiling* / methods
  • Gene Expression Regulation, Neoplastic
  • Humans
  • Image Processing, Computer-Assisted / methods
  • Neoplasm Recurrence, Local / genetics
  • Neoplasms / genetics
  • Neoplasms / pathology
  • Transcriptome / genetics