Modeling the functional consequences of single residue replacements in bacteriophage f1 gene V protein

Protein Eng Des Sel. 2009 Nov;22(11):665-71. doi: 10.1093/protein/gzp050. Epub 2009 Aug 18.

Abstract

A computational mutagenesis methodology utilizing a four-body, knowledge-based, statistical contact potential is applied toward globally quantifying relative environmental perturbations (residual scores) in bacteriophage f1 gene V protein (GVP) due to single amino acid substitutions. We show that residual scores correlate well with experimentally measured relative changes in protein function upon mutation. Residual scores also distinguish between GVP amino acid positions grouped according to protein structural or functional roles or based on similarities in physicochemical characteristics. For each mutant, the in silico mutagenesis additionally yields local measures of environmental change (EC scores) occurring at every residue position (residual profile) relative to the native protein. Implementation of the random forest (RF) algorithm, utilizing experimental GVP mutants whose feature vector components include EC scores at the mutated position and at six structurally nearest neighbors, correctly classifies mutants based on function with up to 77% cross-validation accuracy while achieving 0.82 area under the receiver operating characteristic curve. A control experiment highlights the effectiveness of mutant feature vector signals, and a variety of learning curves are generated to analyze the impact of GVP mutant data set size on performance measures. An optimally trained RF model is subsequently used for inferring function for all the remaining unexplored GVP mutants.
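The feature vector described in the abstract (the EC score at the mutated position plus the EC scores at its six structurally nearest neighbors) and the reported area under the ROC curve can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "structurally nearest" means smallest Euclidean distance between one representative 3D point per residue, and that neighbors are ordered by increasing distance; the abstract specifies neither. The function names, the toy coordinates, and the toy EC values are hypothetical, and the random forest itself is not reproduced here.

```python
import math

def nearest_neighbors(coords, index, k=6):
    """Indices of the k residues closest to residue `index`.

    Assumption: closeness is plain Euclidean distance between one
    representative point per residue; ties are broken by residue index.
    """
    others = [i for i in range(len(coords)) if i != index]
    others.sort(key=lambda i: (math.dist(coords[index], coords[i]), i))
    return others[:k]

def feature_vector(ec_profile, coords, mutated_index, k=6):
    """EC score at the mutated position, then EC scores at its k neighbors.

    `ec_profile` is the residual profile of one mutant: one EC score per
    residue position, relative to the native protein.
    """
    neighbors = nearest_neighbors(coords, mutated_index, k)
    return [ec_profile[mutated_index]] + [ec_profile[i] for i in neighbors]

def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the positive
    example receives the higher score; tied pairs count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: eight residues on a line, one EC score each.
toy_coords = [(float(i), 0.0, 0.0) for i in range(8)]
toy_profile = [-0.5, 0.25, 1.5, -2.0, 0.75, 0.0, 0.5, -1.0]
toy_vector = feature_vector(toy_profile, toy_coords, mutated_index=3)
```

Each mutant contributes one such seven-component vector. A classifier trained on these vectors outputs a score per mutant, and `roc_auc` summarizes how well those scores separate the two functional classes.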

MeSH terms

  • Amino Acid Sequence
  • Amino Acid Substitution
  • Bacteriophages* / physiology
  • Escherichia coli / growth & development
  • Escherichia coli / virology
  • Models, Biological*
  • Models, Molecular
  • Molecular Sequence Data
  • Protein Conformation
  • Structure-Activity Relationship
  • Viral Proteins / chemistry
  • Viral Proteins / genetics*
  • Viral Proteins / metabolism*

Substances

  • Viral Proteins
  • gene V protein, Enterobacteria phage f1