Prediction of protein folding from amino acid sequence over discrete conformation spaces

Biochemistry. 1991 Apr 30;30(17):4232-7. doi: 10.1021/bi00231a018.

Abstract

Predicting the three-dimensional structure of a protein given only its amino acid sequence is a long-standing goal in computational chemistry. In the thermodynamic approach, one needs a potential function of conformation that resembles the free energy of the real protein to the extent that the global minimum of the potential is attained by the native conformation and no other. In practice, this has never been achieved with certainty because even with greatly simplified representations of the polypeptide chain, there are an astronomical number of local minima to examine. If one chooses instead a protein representation with only a large but manageable number of discrete conformations, then the global preference of the potential for the native can be directly verified. Representing a protein as a walk on a two-dimensional square lattice makes it easy to see that simple functions of the interresidue contacts are sufficient to globally favor a given "native" conformation, as long as it is a compact, globular structure. Explicit representation of the solvent is not required. Another more realistic way to confine the conformational search to a finite set is to draw alternative conformations from fragments of larger proteins having known crystal structure. Then it is possible to construct a simple function of interresidue contacts in three dimensions such that only 8 proteins are required to determine the adjustable parameters, and the native conformations of 37 other proteins are correctly preferred over all alternative conformations. The deduced function favors short-range backbone-backbone contacts regardless of residue type and long-range hydrophobic associations. Interactions over long distances, such as electrostatics, are not required.
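The lattice argument can be illustrated with a minimal sketch, assuming a 9-residue chain and a 3 x 3 serpentine "native" walk chosen here purely for illustration. The contact function below simply rewards contacts found in the native walk; it is a hypothetical stand-in, not the potential derived in the paper. All names (`enumerate_walks`, `contacts`, `energy`, `NATIVE`) are assumptions of this sketch. Because the conformation space is finite, the global preference for the native can be checked by exhaustive enumeration.

```python
from itertools import combinations

STEPS = [(1, 0), (0, 1), (-1, 0), (0, -1)]


def enumerate_walks(n):
    """Return all self-avoiding walks of n >= 2 residues on the square
    lattice, keeping one representative per rotation/mirror class."""
    results = []

    def extend(walk, occupied):
        if len(walk) == n:
            results.append(tuple(walk))
            return
        x, y = walk[-1]
        for dx, dy in STEPS:
            # Remove mirror images: the first step off the x-axis must go up.
            if dy == -1 and all(p[1] == 0 for p in walk):
                continue
            nxt = (x + dx, y + dy)
            if nxt in occupied:
                continue  # self-avoidance
            walk.append(nxt)
            occupied.add(nxt)
            extend(walk, occupied)
            walk.pop()
            occupied.remove(nxt)

    # Fix the first bond along +x to remove rotations.
    extend([(0, 0), (1, 0)], {(0, 0), (1, 0)})
    return results


def contacts(walk):
    """Pairs of residues that are lattice neighbors but not bonded."""
    return frozenset(
        (i, j)
        for i, j in combinations(range(len(walk)), 2)
        if j > i + 1
        and abs(walk[i][0] - walk[j][0]) + abs(walk[i][1] - walk[j][1]) == 1
    )


def energy(walk, favorable):
    """Hypothetical contact function: -1 for each favorable contact made."""
    return -len(contacts(walk) & favorable)


# Illustrative compact "native": a serpentine walk filling a 3 x 3 square.
NATIVE = ((0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (0, 1), (0, 2), (1, 2), (2, 2))

favorable = contacts(NATIVE)
walks = enumerate_walks(len(NATIVE))
e_min = min(energy(w, favorable) for w in walks)
minimizers = [w for w in walks if energy(w, favorable) == e_min]

print(len(walks), "conformations;", len(minimizers), "at the global minimum")
```

In this toy case the native is the only conformation at the global minimum, because its four contacts lock the chain into a rigid set of unit squares. A less compact native would make fewer contacts and could share its minimum with other walks, which is consistent with the abstract's restriction to compact, globular structures.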

Publication types

  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Amino Acid Sequence*
  • Crystallization
  • Molecular Sequence Data
  • Protein Conformation