Using deep-learning predictions of inter-residue distances for model validation

Acta Crystallogr D Struct Biol. 2022 Dec 1;78(Pt 12):1412-1427. doi: 10.1107/S2059798322010415. Epub 2022 Nov 25.

Abstract

Determining a protein structure typically entails building a model that satisfies the collected experimental observations and depositing it in the Protein Data Bank. Experimental limitations can lead to unavoidable uncertainties during model building, which introduce errors into the deposited model. Many metrics are available for model validation, but most are limited to the physico-chemical aspects of the model or its match to the experimental data. Recent advances in deep learning have enabled increasingly accurate prediction of inter-residue distances, which has played a pivotal role in the recent improvements in protein ab initio modelling. Here, new validation methods are presented that compare these precise inter-residue distance predictions with the distances observed in the protein model. Sequence-register errors are detected particularly clearly, and the register shifts required for their correction can be reliably determined. The method is available in the ConKit package (https://www.conkit.org).
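
The comparison described above can be illustrated with a minimal sketch. It is not the ConKit implementation and does not use the ConKit API. All function names, the `min_separation` and `max_shift` parameters, and the scoring by mean absolute distance difference are assumptions made for illustration only. The sketch assumes one representative coordinate per residue (for example the C-alpha atom) and a predicted distance matrix indexed by sequence position.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between per-residue coordinates."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def per_residue_error(predicted, observed, min_separation=3):
    """Mean absolute difference between predicted and observed distances
    for each residue, ignoring pairs closer than min_separation in sequence.
    (Hypothetical score; the published method may differ.)"""
    n = len(observed)
    errors = []
    for i in range(n):
        diffs = [abs(predicted[i][j] - observed[i][j])
                 for j in range(n) if abs(i - j) >= min_separation]
        errors.append(sum(diffs) / len(diffs) if diffs else 0.0)
    return errors


def best_register_shift(predicted, observed, start, end, max_shift=3):
    """Scan candidate register shifts for model residues start..end-1.

    For each shift k, the observed distances of model residue i are compared
    with the distances predicted for sequence position i + k. Returns the
    (shift, score) pair with the lowest mean absolute difference.
    """
    n = len(observed)
    outside = [j for j in range(n) if j < start or j >= end]
    best_shift, best_score = 0, float("inf")
    for shift in range(-max_shift, max_shift + 1):
        # Skip shifts that would move the segment off the sequence.
        if start + shift < 0 or end + shift > n:
            continue
        diffs = [abs(predicted[i + shift][j] - observed[i][j])
                 for i in range(start, end) for j in outside]
        if not diffs:
            continue
        score = sum(diffs) / len(diffs)
        if score < best_score:
            best_shift, best_score = shift, score
    return best_shift, best_score
```

In this sketch, residues with a high `per_residue_error` would be candidates for closer inspection, and `best_register_shift` applied to a suspicious segment suggests the shift that best reconciles the model with the predicted distances. A shift of 0 means the current register already agrees best.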

Keywords: AlphaFold2; ConKit; conkit-validate; inter-residue distances; model validation.

MeSH terms

  • Databases, Protein
  • Deep Learning*