Identifying somatic fingerprints of cancers defined by germline and environmental risk factors

Genet Epidemiol. 2024 Dec;48(8):455-467. doi: 10.1002/gepi.22565. Epub 2024 Apr 30.

Abstract

Numerous studies over the past generation have identified germline variants that increase the risk of specific cancers. Simultaneously, a revolution in sequencing technology has permitted high-throughput annotation of the somatic genomes that characterize individual tumors. However, examining the relationship between germline variants and somatic alteration patterns is made difficult by the large number of variants in a typical tumor, the rarity of most individual variants, and the heterogeneity of tumor somatic fingerprints. In this article, we propose a statistical methodology that frames the investigation of germline-somatic relationships in an interpretable manner. The method uses meta-features embodying the biological contexts of individual somatic alterations to implicitly group rare mutations. Our team has previously used this technique, through a multilevel regression model, to diagnose tumor site of origin with high accuracy. Herein, we further leverage topic models from computational linguistics to achieve interpretable lower-dimensional embeddings of the meta-features. We demonstrate how the method can identify distinctive somatic profiles linked to specific germline variants or environmental risk factors. We illustrate the method using The Cancer Genome Atlas whole-exome sequencing data to characterize somatic tumor fingerprints in breast cancer patients with germline BRCA1/2 mutations and in head and neck cancer patients exposed to human papillomavirus.
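
The general idea of embedding meta-features with a topic model can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain latent Dirichlet allocation model fitted by collapsed Gibbs sampling, treats each tumor as a "document" whose "words" are meta-feature labels, and then compares average topic proportions between two groups. The function names (`fit_lda`, `group_mean`), the token labels, and the toy data are all hypothetical and exist only for illustration.

```python
import random


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampler for a plain LDA topic model (illustrative only).

    docs: list of token lists, one list of meta-feature labels per tumor.
    Returns (theta, phi, vocab): per-tumor topic proportions, per-topic
    token distributions, and the sorted vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    n_vocab = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_vocab for _ in range(n_topics)]
    topic_total = [0] * n_topics
    assign = []

    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z = []
        for w in doc:
            k = rng.randrange(n_topics)
            z.append(k)
            doc_topic[d][k] += 1
            topic_word[k][index[w]] += 1
            topic_total[k] += 1
        assign.append(z)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = index[w], assign[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_vocab * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    theta = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + n_vocab * beta) for c in topic_word[k]]
        for k in range(n_topics)
    ]
    return theta, phi, vocab


def group_mean(theta, flags, value):
    """Average topic proportions over tumors whose flag equals `value`."""
    rows = [row for row, f in zip(theta, flags) if f == value]
    return [sum(col) / len(rows) for col in zip(*rows)]


# Toy data (made up): each tumor is a bag of meta-feature labels.
tumors = [
    ["type:C>T", "gene:G1", "type:C>T", "gene:G2"],
    ["type:C>T", "gene:G1", "gene:G1"],
    ["type:T>A", "gene:G3", "type:T>A", "gene:G4"],
    ["type:T>A", "gene:G3", "gene:G4"],
]
carrier = [1, 1, 0, 0]  # hypothetical germline or exposure indicator

theta, phi, vocab = fit_lda(tumors, n_topics=2)
print("carriers:    ", group_mean(theta, carrier, 1))
print("non-carriers:", group_mean(theta, carrier, 0))
```

In a real analysis the group comparison would need a formal model with uncertainty estimates rather than a difference of raw means; the sketch only shows how rare individual mutations are pooled through shared meta-feature labels before the groups are compared.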

Keywords: germline mutations; germline‐somatic associations; meta‐features; multilevel regression modeling; somatic mutations; topic models.

MeSH terms

  • BRCA1 Protein / genetics
  • BRCA2 Protein / genetics
  • Breast Neoplasms* / genetics
  • Breast Neoplasms* / pathology
  • Exome Sequencing
  • Female
  • Gene-Environment Interaction
  • Genetic Predisposition to Disease
  • Germ-Line Mutation*
  • Head and Neck Neoplasms / epidemiology
  • Head and Neck Neoplasms / genetics
  • Humans
  • Neoplasms / genetics
  • Papillomavirus Infections / complications
  • Papillomavirus Infections / genetics
  • Papillomavirus Infections / virology
  • Risk Factors

Substances

  • BRCA1 Protein
  • BRCA1 protein, human
  • BRCA2 Protein
  • BRCA2 protein, human