Probabilistic Mixture Models Improve Calibration of Panel-derived Tumor Mutational Burden in the Context of both Tumor-normal and Tumor-only Sequencing

Cancer Res Commun. 2023 Mar 28;3(3):501-509. doi: 10.1158/2767-9764.CRC-22-0339. eCollection 2023 Mar.

Abstract

Background: Tumor mutational burden (TMB) has been investigated as a biomarker for immune checkpoint blockade (ICB) therapy. TMB is increasingly estimated with gene panel-based assays rather than whole-exome sequencing. Because different gene panels cover overlapping but distinct genomic coordinates, comparisons across panels are difficult. Previous studies have suggested that each panel be standardized and calibrated to exome-derived TMB to ensure comparability. With TMB cutoffs being developed from panel-based assays, there is a need to understand how to properly estimate exomic TMB values from different panel-based assays.

Design: We propose calibrating panel-derived TMB to exomic TMB with probabilistic mixture models that allow for nonlinear relationships and heteroscedastic error. We examined various inputs, including nonsynonymous, synonymous, and hotspot mutation counts, along with genetic ancestry. Using The Cancer Genome Atlas cohort, we generated a tumor-only version of the panel-restricted data by reintroducing private germline variants.
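
As an illustration of how a mixture model can capture both nonlinearity and heteroscedastic error, the following is a minimal sketch, not the authors' implementation. It assumes a two-component Gaussian mixture whose weights, means, and standard deviations all depend on a single input `x` (for example, a log-transformed panel mutation count). The parameter values in `COMPONENTS` and all function names are hypothetical and chosen only for demonstration.

```python
import math

def softmax(logits):
    """Convert unnormalized scores into weights that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def normal_pdf(y, mean, sd):
    """Density of a normal distribution at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

# Hypothetical parameters (not fitted to any data). Per component:
# (logit intercept, logit slope, mean intercept, mean slope,
#  log-sd intercept, log-sd slope)
COMPONENTS = [
    (0.0, 0.0, 0.5, 0.9, -1.0, 0.10),
    (-2.0, 0.8, 1.5, 1.2, -0.5, 0.25),
]

def mixture_params(x):
    """Return (weights, means, sds) of the mixture for input x."""
    weights = softmax([a + b * x for a, b, *_ in COMPONENTS])
    means = [c + d * x for _, _, c, d, _, _ in COMPONENTS]
    # The sd depends on x, so the error is heteroscedastic.
    sds = [math.exp(e + f * x) for *_, e, f in COMPONENTS]
    return weights, means, sds

def mixture_pdf(y, x):
    """Conditional density p(y | x) under the mixture."""
    weights, means, sds = mixture_params(x)
    return sum(w * normal_pdf(y, m, s) for w, m, s in zip(weights, means, sds))

def mixture_mean(x):
    """Point estimate E[y | x]; nonlinear in x because the weights vary with x."""
    weights, means, _ = mixture_params(x)
    return sum(w * m for w, m in zip(weights, means))
```

In this sketch each component's mean is linear in `x`, but the overall expected value is not, because the mixing weights shift between components as `x` grows. A single linear regression with constant variance cannot represent either effect.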

Results: The proposed probabilistic mixture models captured the distribution of both tumor-normal and tumor-only data more accurately than linear regression. Applying a model trained on tumor-normal data to tumor-only input resulted in biased TMB predictions. Including synonymous mutations resulted in better regression metrics across both data types, but ultimately a model able to dynamically weight the various input mutation types exhibited optimal performance. Including genetic ancestry improved model performance only in the context of tumor-only data, wherein private germline variants are observed.

Significance: A probabilistic mixture model better models the nonlinearity and heteroscedasticity of the data as compared with linear regression. Tumor-only panel data are needed to properly calibrate tumor-only panels to exomic TMB. Leveraging the uncertainty of point estimates from these models better informs cohort stratification in terms of TMB.
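
To make the point about uncertainty-informed stratification concrete, here is a hedged sketch of one way a predictive mixture distribution could be used to assign samples to TMB strata. It is not taken from the paper. The cutoff of 10 mutations/Mb, the 0.9 confidence threshold, the three example predictive distributions, and the function names are all illustrative assumptions.

```python
import math

def normal_cdf(y, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((y - mean) / (sd * math.sqrt(2.0))))

def prob_above(cutoff, mixture):
    """P(TMB > cutoff) for a mixture given as [(weight, mean, sd), ...]."""
    return sum(w * (1.0 - normal_cdf(cutoff, m, s)) for w, m, s in mixture)

def point_estimate(mixture):
    """Mean of the mixture, i.e. the single-number prediction."""
    return sum(w * m for w, m, _ in mixture)

def stratify(mixture, cutoff=10.0, confident=0.9):
    """Label a sample high/low only when the predictive distribution is decisive."""
    p = prob_above(cutoff, mixture)
    if p >= confident:
        return "high"
    if p <= 1.0 - confident:
        return "low"
    return "uncertain"

# Hypothetical predictive distributions in mutations/Mb (assumed, not real data).
samples = {
    "sample_a": [(1.0, 25.0, 3.0)],
    "sample_b": [(0.7, 9.0, 1.0), (0.3, 14.0, 3.0)],
    "sample_c": [(1.0, 2.0, 1.0)],
}
labels = {name: stratify(mix) for name, mix in samples.items()}
```

In this toy setup, `sample_b` has a point estimate just above the cutoff, yet most of its predictive mass lies below it, so it is flagged as uncertain rather than forced into the high stratum. A point-estimate-only rule would not reveal that ambiguity.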

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomarkers, Tumor / genetics
  • Calibration
  • Genomics
  • Humans
  • Mutation
  • Neoplasms* / genetics

Substances

  • Biomarkers, Tumor