PEP: Predictions for Entire Proteomes

Nucleic Acids Res. 2003 Jan 1;31(1):410-3. doi: 10.1093/nar/gkg102.

Abstract

PEP is a database of Predictions for Entire Proteomes. The database contains summaries of analyses of protein sequences from a range of organisms representing all three major kingdoms of life: eukaryotes, prokaryotes and archaea. All publicly available proteins for these organisms were aligned against SWISS-PROT, TrEMBL and PDB. Additionally, the following annotations are provided: secondary structure, transmembrane helices, coiled coils, regions of low complexity, signal peptides, PROSITE motifs, nuclear localization signals and classes of cellular function. Proteins that contain long regions without regular secondary structure are also identified. We have produced a related database of structural domain-like fragments derived from PEP, together with clusters based on homology between all fragments. The PEP database, fragments and clusters are distributed freely as a set of flat files and have been integrated into SRS. The PEP group of databases can be accessed at http://cubic.bioc.columbia.edu/pep.

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Animals
  • Archaeal Proteins / chemistry
  • Cluster Analysis
  • Databases, Protein*
  • Eukaryotic Cells
  • Humans
  • Prokaryotic Cells
  • Protein Conformation
  • Proteins / chemistry
  • Proteome / chemistry*
  • Proteome / physiology
  • Sequence Homology, Amino Acid
  • User-Computer Interface

Substances

  • Archaeal Proteins
  • Proteins
  • Proteome