The R protein of SARS-CoV: analyses of structure and function based on four complete genome sequences of isolates BJ01-BJ04

Genomics Proteomics Bioinformatics. 2003 May;1(2):155-65. doi: 10.1016/s1672-0229(03)01019-2.

Abstract

The R (replicase) protein is the uniquely defined non-structural protein (NSP) responsible for RNA replication, mutation rate or fidelity, and regulation of transcription in coronaviruses and many other ssRNA viruses. Based on our complete genome sequences of four isolates (BJ01-BJ04) of SARS-CoV from Beijing, China, we analyzed the structure and predicted functions of the R protein in comparison with 13 other isolates of SARS-CoV and 6 other coronaviruses. The entire ORF (open reading frame) encodes two major enzyme activities: RNA-dependent RNA polymerase (RdRp) and proteinase. The R polyprotein undergoes complex proteolytic processing to produce 15 function-related peptides. A hydrophobic domain (HOD) and a hydrophilic domain (HID) are newly identified within NSP1. The substitution rate of the R protein is close to the average for the SARS-CoV genome. The functional domains in all NSPs of the R protein give different phylogenetic results, suggesting that they mutate at different rates under selective pressure. Eleven highly conserved regions in RdRp and twelve cleavage sites for 3CLP (chymotrypsin-like protein) have been identified as potential drug targets. These findings suggest that analysis of the R protein can yield information about the phylogeny of SARS-CoV, as well as potential tools for drug design, genotyping and diagnostics of SARS.
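
The abstract does not say how the conserved regions in RdRp were identified. As an illustration only, the sketch below shows one simple way such regions could be located: scanning a multiple sequence alignment for runs of columns in which every sequence carries the same residue. The function name `conserved_regions`, the `min_len` threshold, and the toy sequences are hypothetical and are not taken from the paper.

```python
def conserved_regions(aligned, min_len=5):
    """Return half-open (start, end) column ranges where all aligned
    sequences share the same non-gap residue for at least min_len columns.

    Assumption: `aligned` is a list of equal-length strings from an
    existing multiple sequence alignment, with '-' marking gaps.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")

    regions = []
    start = None  # first column of the current identical run, if any
    for i in range(length):
        column = {seq[i] for seq in aligned}
        identical = len(column) == 1 and "-" not in column
        if identical:
            if start is None:
                start = i
        else:
            # The run ended at column i; keep it if it is long enough.
            if start is not None and i - start >= min_len:
                regions.append((start, i))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and length - start >= min_len:
        regions.append((start, length))
    return regions


# Toy alignment (made-up sequences, not SARS-CoV data).
toy = ["MKTAYSGDDLV", "MKTAYSGDDLV", "MKTAWSGDDLV"]
print(conserved_regions(toy, min_len=4))
```

Strict column identity is the simplest possible criterion. A real analysis would more likely score columns with a substitution matrix or an entropy measure, so that conservative replacements are tolerated.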

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Base Composition
  • Base Sequence
  • Cluster Analysis
  • Computational Biology
  • Conserved Sequence / genetics
  • Evolution, Molecular
  • Gene Components
  • Genome, Viral*
  • Molecular Sequence Data
  • Mutation / genetics*
  • Phylogeny*
  • Protein Structure, Tertiary
  • RNA-Dependent RNA Polymerase / genetics*
  • Sequence Analysis, DNA
  • Severe acute respiratory syndrome-related coronavirus / genetics*

Substances

  • RNA-Dependent RNA Polymerase

Associated data

  • GENBANK/AF220295
  • GENBANK/AY274119
  • GENBANK/AY278489
  • GENBANK/AY278491
  • GENBANK/AY278554
  • GENBANK/AY278741
  • GENBANK/AY282752
  • GENBANK/AY283794
  • GENBANK/AY283795
  • GENBANK/AY283796
  • GENBANK/AY283797
  • GENBANK/AY283798
  • GENBANK/AY291451
  • GENBANK/AY297028
  • RefSeq/NC_001451
  • RefSeq/NC_002306
  • RefSeq/NC_002645
  • RefSeq/NC_003045
  • RefSeq/NC_003436
  • RefSeq/NC_004718
  • RefSeq/NC_008146