Correlating enzymatic reactivity for different substrates using transferable data-driven collective variables

Proc Natl Acad Sci U S A. 2024 Dec 3;121(49):e2416621121. doi: 10.1073/pnas.2416621121. Epub 2024 Nov 26.

Abstract

Machine learning (ML) is transforming the investigation of complex biological processes. In enzymatic catalysis, one significant challenge is identifying the reactive conformations (RCs) of the enzyme:substrate complex, in which the substrate assumes the precise arrangement in the active site necessary to initiate a reaction. Traditional methods are hindered by the complexity of the multidimensional free energy landscape involved in the transition from nonreactive to reactive conformations. Here, we applied ML techniques to address this challenge, focusing on human pancreatic α-amylase, a crucial enzyme in type-II diabetes treatment. Using ML-based collective variables (CVs), we correlated the probability of being in an RC with the experimental catalytic activity of several malto-oligosaccharide substrates. Our findings demonstrate a remarkable transferability of these CVs across various compounds, significantly streamlining the modeling process and reducing both computational demand and manual intervention in setting up simulations for new substrates. This approach not only advances our understanding of enzymatic processes but also holds substantial potential for accelerating drug discovery by enabling rapid and accurate evaluation of drug efficacy across different generations of inhibitors.
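
The abstract does not specify how the probability of being in an RC is estimated or how it is compared with experiment. As a minimal sketch of one plausible reading, the Python below assumes that each simulation frame carries a scalar CV value, that a frame counts as reactive when that value is at or above a chosen threshold, and that optional per-frame weights (for example from reweighting a biased simulation) are available. The function names, the threshold, and the Pearson correlation are illustrative assumptions, not the authors' protocol.

```python
import math


def reactive_probability(cv_values, weights=None, threshold=0.0):
    """Weighted fraction of frames whose CV value is >= threshold.

    Assumption: a single scalar CV separates reactive from nonreactive
    frames; `threshold` is a hypothetical cutoff chosen by the user.
    """
    if weights is None:
        weights = [1.0] * len(cv_values)
    if len(weights) != len(cv_values):
        raise ValueError("cv_values and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    reactive = sum(w for s, w in zip(cv_values, weights) if s >= threshold)
    return reactive / total


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

In use, one would compute `reactive_probability` once per substrate from that substrate's trajectory and pass the resulting list, together with the matching experimental activities, to `pearson`. Whether the raw values or their logarithms are the appropriate quantities to compare is a modeling choice the abstract leaves open.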

Keywords: active site and substrate pre-organization; enzyme catalysis; glycolysis; machine learning-based collective variables; transfer learning.

MeSH terms

  • Catalytic Domain
  • Humans
  • Machine Learning*
  • Oligosaccharides / chemistry
  • Oligosaccharides / metabolism
  • Pancreatic alpha-Amylases / chemistry
  • Pancreatic alpha-Amylases / metabolism
  • Protein Conformation
  • Substrate Specificity

Substances

  • Pancreatic alpha-Amylases
  • Oligosaccharides