PEP: Predictions for Entire Proteomes

Phil Carter; Jinfeng Liu; Burkhard Rost

doi:10.1093/nar/gkg102

Nucleic Acids Res. 2003 Jan 1;31(1):410-3. doi: 10.1093/nar/gkg102.

Authors

Phil Carter¹, Jinfeng Liu, Burkhard Rost

Affiliation

¹ CUBIC, Department of Biochemistry and Molecular Biophysics, Columbia University, 650 West 168th Street BB217, New York, NY 10032, USA. carter@cubic.bioc.columbia.edu

Abstract

PEP is a database of Predictions for Entire Proteomes. The database contains summaries of analyses of protein sequences from a range of organisms representing all three major kingdoms of life: eukaryotes, prokaryotes and archaea. All proteins publicly available for organisms were aligned against SWISS-PROT, TrEMBL and PDB. Additionally, the following annotations are provided: secondary structure, transmembrane helices, coiled coils, regions of low complexity, signal peptides, PROSITE motifs, nuclear localization signals and classes of cellular function. Proteins that contain long regions without regular secondary structure are also identified. We have produced a related database of structural domain-like fragments derived from PEP and clusters based on homology between all fragments. The PEP database, fragments and clusters are distributed freely as a set of flat files and have been integrated into SRS. The PEP group of databases can be accessed from: http://cubic.bioc.columbia.edu/pep.
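The abstract mentions identifying proteins with long regions that lack regular secondary structure. A minimal sketch of that idea follows, assuming a per-residue three-state prediction string (H = helix, E = strand, anything else = non-regular). The function name, the string encoding, and the default length threshold of 70 residues are illustrative assumptions, not details taken from PEP itself, and the actual criterion used by the database may differ (for example, by tolerating a small fraction of helix or strand within a region).

```python
from typing import List, Tuple


def long_nonregular_regions(ss: str, min_len: int = 70) -> List[Tuple[int, int]]:
    """Find runs of residues that are neither helix (H) nor strand (E).

    Returns 0-based, half-open (start, end) spans whose length is at
    least min_len. The threshold is a hypothetical default.
    """
    regions: List[Tuple[int, int]] = []
    start = None  # start index of the current non-regular run, if any
    for i, state in enumerate(ss):
        regular = state in ("H", "E")
        if not regular and start is None:
            start = i  # a non-regular run begins here
        elif regular and start is not None:
            if i - start >= min_len:
                regions.append((start, i))
            start = None  # the run ended at a helix/strand residue
    # Close a run that extends to the end of the sequence.
    if start is not None and len(ss) - start >= min_len:
        regions.append((start, len(ss)))
    return regions


# Toy example with a short threshold so the behaviour is easy to follow.
example = "HHHH" + "L" * 5 + "EE" + "L" * 2
print(long_nonregular_regions(example, min_len=3))
```

With the toy string above and `min_len=3`, only the five-residue loop qualifies; the two-residue tail is too short to be reported.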

Publication types

Research Support, U.S. Gov't, P.H.S.

MeSH terms

Animals
Archaeal Proteins / chemistry
Cluster Analysis
Databases, Protein*
Eukaryotic Cells
Humans
Prokaryotic Cells
Protein Conformation
Proteins / chemistry
Proteome / chemistry*
Proteome / physiology
Sequence Homology, Amino Acid
User-Computer Interface

Substances

Archaeal Proteins
Proteins
Proteome
