Characterizing Intra-Tumor Heterogeneity From Somatic Mutations Without Copy-Neutral Assumption

IEEE/ACM Trans Comput Biol Bioinform. 2021 Nov-Dec;18(6):2271-2280. doi: 10.1109/TCBB.2020.2973635. Epub 2021 Dec 8.

Abstract

Bulk samples of the same patient are heterogeneous in nature, comprising different subpopulations (subclones) of cancer cells. Cells in a tumor subclone are characterized by a unique mutational genotype profile. Resolving tumor heterogeneity by estimating the genotypes, the cellular proportions and the number of subclones present in the tumor can help in understanding cancer progression and treatment. We present a novel method, ChaClone2, to efficiently deconvolve the observed variant allele fractions (VAFs), with consideration for possible effects from copy number aberrations at the mutation loci. Our method describes a state-space formulation of the feature allocation model, deconvolving the observed VAFs from samples of the same patient into three matrices: subclonal total copy numbers and subclonal variant copy numbers for mutated genes, and proportions of subclones in each sample. We describe an efficient sequential Monte Carlo (SMC) algorithm to estimate these matrices. Extensive simulations show that ChaClone2 yields better accuracy than other state-of-the-art methods that address similar problems, and that it scales to large datasets. ChaClone2 also allows the model parameter estimates to be refined whenever new mutation data from freshly sequenced genomic locations become available. MATLAB code and datasets are available to download at: https://github.com/moyanre/method2.
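
To make the three-matrix decomposition concrete, the following is a minimal sketch of the forward direction only: given subclonal variant copy numbers, subclonal total copy numbers and subclone proportions, it computes the VAFs one would expect to observe. It assumes a common mixture formulation (expected VAF equals proportion-weighted variant copies divided by proportion-weighted total copies), which may differ from the exact ChaClone2 model. The function name, the inclusion of normal cells as an extra population, and all toy values are hypothetical. The code is in Python for illustration and is not taken from the authors' MATLAB release.

```python
def expected_vafs(variant_cn, total_cn, proportions):
    """Expected VAFs under an assumed mixture model (not the paper's code).

    variant_cn[m][k]: variant copies of mutation m in population k
    total_cn[m][k]:   total copies at the locus of mutation m in population k
    proportions[k][s]: fraction of population k in sample s (columns sum to 1)
    Returns vafs[m][s].
    """
    n_pops = len(proportions)
    n_samples = len(proportions[0])
    vafs = []
    for m in range(len(variant_cn)):
        row = []
        for s in range(n_samples):
            # Average number of variant copies per cell in sample s
            variant = sum(variant_cn[m][k] * proportions[k][s] for k in range(n_pops))
            # Average number of total copies per cell in sample s
            total = sum(total_cn[m][k] * proportions[k][s] for k in range(n_pops))
            row.append(variant / total)
        vafs.append(row)
    return vafs


# Hypothetical toy data: populations are (normal, subclone A, subclone B).
variant_cn = [[0, 1, 1],   # mutation 1: heterozygous in A and B, copy-neutral
              [0, 0, 2]]   # mutation 2: only in B, on an amplified locus
total_cn = [[2, 2, 2],
            [2, 2, 4]]
proportions = [[0.2, 0.5],  # normal cells in samples 1 and 2
               [0.5, 0.3],  # subclone A
               [0.3, 0.2]]  # subclone B

vafs = expected_vafs(variant_cn, total_cn, proportions)
for m, row in enumerate(vafs, start=1):
    print(f"mutation {m}: " + ", ".join(f"{v:.3f}" for v in row))
```

In this toy setting the second mutation sits on a locus amplified to four copies in subclone B, so its expected VAF differs from what a copy-neutral model would predict for the same proportions. Inference runs in the opposite direction: the abstract describes estimating the three matrices from observed VAFs with an SMC algorithm, which this sketch does not attempt.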

MeSH terms

  • Algorithms
  • Bayes Theorem
  • Computational Biology / methods*
  • DNA Copy Number Variations / genetics*
  • Genetic Heterogeneity
  • Humans
  • Monte Carlo Method
  • Mutation / genetics*
  • Neoplasms / genetics*
  • Stochastic Processes