Scaling Monte-Carlo-Based Inference on Antibody and TCR Repertoires

Josiah Couch; Rohit Arora; Jasper Braun; Joesph Kaplinsky; Elliot Hill; Anthony Li; Brett Altschul; Ramy Arnaout

ArXiv [Preprint]. 2023 Dec 19:arXiv:2312.12525v1.

Authors

Josiah Couch¹, Rohit Arora¹, Jasper Braun¹, Joesph Kaplinsky¹, Elliot Hill¹, Anthony Li¹, Brett Altschul², Ramy Arnaout¹,³

Affiliations

¹ Department of Pathology, Beth Israel Deaconess Medical Center, Boston, MA 02215.
² Department of Physics and Astronomy, University of South Carolina, Columbia, SC 29208.
³ Harvard Medical School, Boston, MA 02115.

PMID: 38196748
PMCID: PMC10775351

Abstract

Previously, it has been shown that maximum-entropy models of immune-repertoire sequence can be used to determine a person's vaccination status. However, this approach has the drawback of requiring a computationally intensive method to compute each model's partition function ($Z$), the normalization constant required for calculating the probability that the model will generate a given sequence. Specifically, the method required generating approximately 10¹⁰ sequences via Monte-Carlo simulations for each model. This is impractical for large numbers of models. Here we propose an alternative method that requires estimating $Z$ this way for only a few models: it then uses these expensive estimates to estimate $Z$ more efficiently for the remaining models. We demonstrate that this new method enables the generation of accurate estimates for 27 models using only three expensive estimates, thereby reducing the computational cost by an order of magnitude. Importantly, this gain in efficiency is achieved with only minimal impact on classification accuracy. Thus, this new method enables larger-scale investigations in computational immunology and represents a useful contribution to energy-based modeling more generally.
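To make the idea of reusing an expensive estimate of $Z$ concrete, here is a minimal sketch on a toy model. The abstract does not say which estimator the paper uses, so this is not the authors' method. It substitutes a standard importance-reweighting identity, $Z_{new}/Z_{ref} = E_{s \sim p_{ref}}[\exp(-(E_{new}(s) - E_{ref}(s)))]$. The independent-site energy function, the sequence length, the alphabet size, the sample counts, and all function names are assumptions made for illustration. The toy model was chosen because its exact $Z$ can be computed for comparison.

```python
import math
import random


def energy(fields, seq):
    """Energy of a sequence under a toy independent-site model: E(s) = -sum_i h_i(s_i)."""
    return -sum(fields[i][a] for i, a in enumerate(seq))


def exact_log_z(fields):
    """Exact log Z; available only because this toy model factorizes over sites."""
    return sum(math.log(sum(math.exp(h) for h in site)) for site in fields)


def sample_from_model(fields, rng):
    """Draw one sequence from p(s) = exp(-E(s)) / Z (exact sampling for the toy model)."""
    return [
        rng.choices(range(len(site)), weights=[math.exp(h) for h in site])[0]
        for site in fields
    ]


def expensive_log_z(fields, n_samples, rng):
    """Brute-force Monte Carlo: Z = q^L * mean over uniform sequences of exp(-E(s))."""
    q, length = len(fields[0]), len(fields)
    total = 0.0
    for _ in range(n_samples):
        seq = [rng.randrange(q) for _ in range(length)]
        total += math.exp(-energy(fields, seq))
    return length * math.log(q) + math.log(total / n_samples)


def cheap_log_z(fields_ref, log_z_ref, fields_new, n_samples, rng):
    """Reweighting (assumed technique): log Z_new = log Z_ref + log E_ref[exp(-(E_new - E_ref))]."""
    total = 0.0
    for _ in range(n_samples):
        seq = sample_from_model(fields_ref, rng)
        total += math.exp(-(energy(fields_new, seq) - energy(fields_ref, seq)))
    return log_z_ref + math.log(total / n_samples)


rng = random.Random(0)
LENGTH, ALPHABET = 6, 4  # hypothetical toy sizes, far smaller than real repertoires

# A reference model and a nearby model whose fields are slightly perturbed.
fields_ref = [[rng.gauss(0, 0.5) for _ in range(ALPHABET)] for _ in range(LENGTH)]
fields_new = [[h + rng.gauss(0, 0.1) for h in site] for site in fields_ref]

log_z_ref = expensive_log_z(fields_ref, 100_000, rng)  # many samples, done once
log_z_new = cheap_log_z(fields_ref, log_z_ref, fields_new, 5_000, rng)  # far fewer samples

print("reference:", log_z_ref, "exact:", exact_log_z(fields_ref))
print("new model:", log_z_new, "exact:", exact_log_z(fields_new))
```

The reweighting step needs few samples only when the new model is close to the reference. As the two energy functions diverge, the variance of the weights grows and the estimate degrades.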

Publication types

Preprint

Grants and funding

R01 AI148747/AI/NIAID NIH HHS/United States