Comparative genomics and evolution of proteins involved in RNA metabolism

Nucleic Acids Res. 2002 Apr 1;30(7):1427-64. doi: 10.1093/nar/30.7.1427.

Abstract

RNA metabolism, broadly defined as the compendium of all processes that involve RNA, including transcription, processing and modification of transcripts, translation, RNA degradation and its regulation, is the central and most evolutionarily conserved part of cell physiology. A comprehensive, genome-wide census of all enzymatic and non-enzymatic protein domains involved in RNA metabolism was conducted using sequence profile analysis and structural comparisons. Proteins related to RNA metabolism comprise 3 to 11% of the complete protein repertoire in bacteria, archaea and eukaryotes, with the greatest fraction seen in parasitic bacteria with small genomes. Approximately one-half of protein domains involved in RNA metabolism are present in most, if not all, species from all three primary kingdoms and are traceable to the last universal common ancestor (LUCA). The principal features of LUCA's RNA metabolism system were reconstructed by parsimony-based evolutionary analysis of all relevant groups of orthologous proteins. This reconstruction shows that LUCA possessed not only the basal translation system, but also the principal forms of RNA modification, such as methylation, pseudouridylation and thiouridylation, as well as simple mechanisms for polyadenylation and RNA degradation. Some of these ancient domains form paralogous groups whose evolution can be traced back in time beyond LUCA, towards low-specificity proteins, which probably functioned as cofactors for ribozymes within the RNA world framework. The main lineage-specific innovations of RNA metabolism systems were identified. The most notable phase of innovation in RNA metabolism coincides with the advent of eukaryotes and was brought about by the merger of the archaeal and bacterial systems via mitochondrial endosymbiosis, but also involved the emergence of several new, eukaryote-specific RNA-binding domains. Subsequent vast expansions of these domains mark the origin of alternative splicing in animals and probably in plants. In addition to the reconstruction of the evolutionary history of RNA metabolism, this analysis produced numerous functional predictions, e.g. of previously undetected enzymes of RNA modification.

Publication types

  • Comparative Study

MeSH terms

  • Animals
  • Bacterial Proteins / genetics
  • Bacterial Proteins / metabolism*
  • Binding Sites / genetics
  • Databases, Nucleic Acid
  • Enzymes / genetics
  • Enzymes / metabolism
  • Evolution, Molecular*
  • Genetic Variation
  • Genome, Bacterial
  • Genomics
  • Humans
  • RNA, Bacterial / genetics
  • RNA, Bacterial / metabolism*

Substances

  • Bacterial Proteins
  • Enzymes
  • RNA, Bacterial