De novo mutational signature discovery in tumor genomes using SparseSignatures

PLoS Comput Biol. 2021 Jun 28;17(6):e1009119. doi: 10.1371/journal.pcbi.1009119. eCollection 2021 Jun.

Abstract

Cancer is the result of mutagenic processes that can be inferred from tumor genomes by analyzing rate spectra of point mutations, or "mutational signatures". Here we present SparseSignatures, a novel framework to extract signatures from somatic point mutation data. Our approach incorporates a user-specified background signature, employs regularization to reduce noise in non-background signatures, uses cross-validation to identify the number of signatures, and is scalable to large datasets. We show that SparseSignatures outperforms current state-of-the-art methods on simulated data using a variety of standard metrics. We then apply SparseSignatures to whole genome sequences of pancreatic and breast tumors, discovering well-differentiated signatures that are linked to known mutagenic mechanisms and are strongly associated with patient clinical features.
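The two central ideas in the abstract, a fixed user-specified background signature and regularization of the non-background signatures, can be illustrated with a minimal sketch. This is not the SparseSignatures implementation: it assumes a squared-error non-negative matrix factorization with an L1 penalty, fitted by standard multiplicative updates, and the function name, parameter names, and default values are hypothetical choices made for illustration only. The cross-validation step for choosing the number of signatures is not shown.

```python
import numpy as np


def sparse_nmf_with_background(M, background, n_signatures, lam=0.1,
                               n_iter=200, seed=0):
    """Illustrative sketch only (hypothetical helper, not the package's API).

    Factorizes a samples x categories mutation-count matrix M as A @ B, where
    row 0 of B is held fixed at the supplied background signature and the
    remaining rows are fitted under an L1 penalty of strength `lam`.
    """
    M = np.asarray(M, dtype=float)
    rng = np.random.default_rng(seed)
    n_samples, n_categories = M.shape
    eps = 1e-12  # guards against division by zero

    # Random non-negative initialization of exposures (A) and signatures (B).
    A = rng.random((n_samples, n_signatures)) + eps
    B = rng.random((n_signatures, n_categories)) + eps
    B[0] = background  # the background signature is given, not learned

    for _ in range(n_iter):
        # Multiplicative update for exposures (unpenalized least squares).
        A *= (M @ B.T) / (A @ B @ B.T + eps)
        # Multiplicative update for signatures; adding `lam` to the
        # denominator is the usual way to apply an L1 penalty in this scheme.
        update = (A.T @ M) / (A.T @ A @ B + lam + eps)
        # Only non-background rows are updated, so row 0 stays fixed.
        B[1:] *= update[1:]

    return A, B
```

In this sketch a larger `lam` pushes more entries of the non-background signatures toward zero, while the background row is never modified, so any diffuse baseline mutation pattern is absorbed by its exposure column in `A` rather than by the learned signatures.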

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Biomarkers, Tumor / genetics
  • Breast Neoplasms / classification
  • Breast Neoplasms / genetics
  • Computational Biology
  • Computer Simulation
  • DNA Mutational Analysis / statistics & numerical data*
  • Databases, Genetic / statistics & numerical data
  • Female
  • Genes, BRCA1
  • Genes, BRCA2
  • Genome, Human
  • Humans
  • Neoplasms / genetics*
  • Pancreatic Neoplasms / classification
  • Pancreatic Neoplasms / genetics
  • Point Mutation*
  • Software

Substances

  • Biomarkers, Tumor

Grants and funding

This work was supported by an R01 grant to A.S. (NIH/NCI) and gift funding from the BRCA Foundation. A.L. was supported by a Young Investigator Award from the BRCA Foundation. D.R. was partially supported by a Bicocca 2020 Starting Grant and by a Premio Giovani Talenti dell'Università degli Studi di Milano-Bicocca. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.