A complete and flexible workflow for metaproteomics data analysis based on MetaProteomeAnalyzer and Prophane

Nat Protoc. 2020 Oct;15(10):3212-3239. doi: 10.1038/s41596-020-0368-7. Epub 2020 Aug 28.

Abstract

Metaproteomics, the study of the collective protein composition of multi-organism systems, provides deep insights into the biodiversity of microbial communities and the complex functional interplay between microbes and their hosts or environment. Thus, metaproteomics has become an indispensable tool in various fields such as microbiology and related medical applications. The computational challenges in the analysis of corresponding datasets differ from those of pure-culture proteomics, e.g., due to the higher complexity of the samples and the larger reference databases demanding specific computing pipelines. Corresponding data analyses usually consist of numerous manual steps that must be closely synchronized. With MetaProteomeAnalyzer and Prophane, we have established two open-source software solutions specifically developed and optimized for metaproteomics. Among other features, peptide-spectrum matching is improved by combining different search engines and, compared to similar tools, metaproteome annotation benefits from the most comprehensive set of available databases (such as NCBI, UniProt, EggNOG, PFAM, and CAZy). The workflow described in this protocol combines both tools and leads the user through the entire data analysis process, including protein database creation, database search, protein grouping and annotation, and results visualization. To the best of our knowledge, this protocol presents the most comprehensive, detailed and flexible guide to metaproteomics data analysis to date. While beginners are provided with robust, easy-to-use, state-of-the-art data analysis in a reasonable time (a few hours, depending on, among other factors, the protein database size and the number of identified peptides and inferred proteins), advanced users benefit from the flexibility and adaptability of the workflow.
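The protein grouping step named in the abstract can be illustrated with a minimal sketch. It assumes one common grouping rule, in which proteins sharing at least one identified peptide are merged into the same group. The function name `group_proteins` and the input layout (a mapping from protein accession to its set of peptide sequences) are hypothetical and chosen for illustration only; this is not the implementation used by MetaProteomeAnalyzer or Prophane.

```python
from collections import defaultdict


def group_proteins(protein_to_peptides):
    """Group proteins that share at least one identified peptide.

    protein_to_peptides: dict mapping a protein accession (str) to a set
    of peptide sequences (hypothetical input layout, for illustration).
    Returns a sorted list of sorted protein-accession lists.
    """
    # Union-find structure: every protein starts as its own group.
    parent = {protein: protein for protein in protein_to_peptides}

    def find(protein):
        # Follow parent links to the group representative,
        # shortening the path along the way.
        while parent[protein] != protein:
            parent[protein] = parent[parent[protein]]
            protein = parent[protein]
        return protein

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the lexicographically smaller accession as representative.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    # Remember the first protein seen for each peptide; any later protein
    # with the same peptide is merged into that protein's group.
    first_seen = {}
    for protein, peptides in protein_to_peptides.items():
        for peptide in peptides:
            if peptide in first_seen:
                union(protein, first_seen[peptide])
            else:
                first_seen[peptide] = protein

    groups = defaultdict(set)
    for protein in protein_to_peptides:
        groups[find(protein)].add(protein)
    return sorted(sorted(members) for members in groups.values())


# Toy example with made-up accessions and peptide sequences.
hits = {
    "P1": {"AAK", "LLR"},
    "P2": {"LLR", "GGK"},
    "P3": {"GGK"},
    "P4": {"TTR"},
}
print(group_proteins(hits))
```

In this toy input, P1 and P2 share one peptide and P2 and P3 share another, so all three fall into a single group even though P1 and P3 have no peptide in common; P4 stays on its own. Stricter rules, such as requiring one protein's peptide set to be a subset of another's, would yield smaller groups.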

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Data Analysis
  • Databases, Protein
  • Microbiota
  • Peptides / chemistry
  • Proteome / analysis*
  • Proteomics / methods*
  • Software
  • Workflow

Substances

  • Peptides
  • Proteome