Classifying metal-binding sites with neural networks

Marjolein Oostrom; Sarah Akers; Noah Garrett; Emma Hanson; Wendy Shaw; Joseph A Laureanti

doi:10.1002/pro.4591

Protein Sci. 2023 Mar;32(3):e4591. doi: 10.1002/pro.4591.

Authors

Marjolein Oostrom¹, Sarah Akers¹, Noah Garrett², Emma Hanson², Wendy Shaw², Joseph A Laureanti²

Affiliations

¹ National Security Directorate, Pacific Northwest National Laboratory, Richland, Washington, USA.
² Physical and Computational Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington, USA.

Abstract

To advance our ability to predict how the protein scaffold affects catalysis, robust classification schemes are needed to define the protein features that influence reactivity. One of these features is a protein's metal-binding ability, as metals are critical to catalytic conversion by metalloenzymes. As a step toward this goal, we used convolutional neural networks (CNNs) to classify metal cofactor binding pockets within a protein scaffold. CNNs classify images based on multiple levels of detail, from edges and corners to entire objects, and can do so rapidly. First, six CNN models were fine-tuned to classify the 20 standard amino acids in order to select a performant model for amino acid classification. This model was then trained in two parallel efforts to classify a 2D image of the environment within a given radius of the central metal-binding site, either an Fe ion or a [2Fe-2S] cofactor: with the metal visible (effort 1) or with the metal hidden (effort 2). We further used two sub-classifications of the [2Fe-2S] cofactor: (1) a standard [2Fe-2S] cofactor and (2) a Rieske [2Fe-2S] cofactor. The model identified all three defined features with >95% accuracy, despite our expectation that metalloenzyme identification would be the more challenging task. This demonstrates that using machine learning to classify and distinguish similar metal-binding sites, even in the absence of a visible cofactor, is indeed possible and offers an additional tool for metal-binding site identification in proteins.
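
The abstract describes classifying the environment within a given radius of the metal site, with the metal either visible or hidden. The following is a minimal sketch of that selection step only, written under assumptions: the `Atom` record, the `local_environment` function, the `metal_elements` parameter, and the toy coordinates are all hypothetical and are not taken from the paper, which does not state here how its 2D images were generated. In this sketch, "hiding" removes only atoms whose element is listed in `metal_elements`; whether the authors also hid the inorganic sulfur of a [2Fe-2S] cofactor is not specified in the abstract.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    """Hypothetical minimal atom record: element symbol plus coordinates in angstroms."""
    element: str
    x: float
    y: float
    z: float


def local_environment(atoms, center, radius, hide_metal=False, metal_elements=("FE",)):
    """Return the atoms within `radius` of `center`, in input order.

    If `hide_metal` is True, atoms whose element is in `metal_elements`
    are dropped (an assumed stand-in for the "metal hidden" setting).
    """
    selected = []
    for atom in atoms:
        distance = math.dist((atom.x, atom.y, atom.z), center)
        if distance > radius:
            continue  # outside the chosen radius
        if hide_metal and atom.element in metal_elements:
            continue  # mask the metal itself
        selected.append(atom)
    return selected


# Toy coordinates for illustration only (not real structural data).
atoms = [
    Atom("FE", 0.0, 0.0, 0.0),
    Atom("S", 2.3, 0.0, 0.0),
    Atom("N", 0.0, 2.1, 0.0),
    Atom("C", 9.0, 9.0, 9.0),  # far from the site, excluded by the radius
]

visible = local_environment(atoms, (0.0, 0.0, 0.0), radius=5.0)
hidden = local_environment(atoms, (0.0, 0.0, 0.0), radius=5.0, hide_metal=True)
```

In a pipeline of the kind the abstract outlines, each selected environment would then be rendered as a 2D image and passed to the fine-tuned CNN; that rendering and training step is outside the scope of this sketch.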

Keywords: Rieske; amino acids; convolutional neural network; image classification; iron-sulfur; metal-binding sites; metalloenzyme.

Publication types

Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

Amino Acid Sequence
Binding Sites
Iron-Sulfur Proteins* / chemistry
Metals / metabolism
Neural Networks, Computer

Substances

Iron-Sulfur Proteins
Metals