Domain tree-based analysis of protein architecture evolution

Mol Biol Evol. 2008 Feb;25(2):254-64. doi: 10.1093/molbev/msm254. Epub 2007 Nov 19.

Abstract

Understanding the dynamics of domain architecture evolution is of great importance for unraveling the functions of proteins. Complex architectures have been created throughout evolution by rearrangement and duplication events. An interesting question is how many times a particular architecture has been created independently, a form of convergent evolution known as domain architecture reinvention. Previous studies have approached this issue by comparing architectures found in different species. We wanted to achieve a finer-grained analysis by reconstructing protein architectures on complete domain trees. The prevalence of domain architecture reinvention in 96 genomes was investigated with a novel domain tree-based method that uses maximum parsimony to infer ancestral protein architectures. Domain architectures were taken from Pfam. To ensure robustness, we applied the method to bootstrap trees and considered only results with strong statistical support. We detected multiple origins for 12.4% of the scored architectures. In a much smaller data set, the subset of completely domain-assigned proteins, the figure was 5.6%. These results indicate that domain architecture reinvention is a much more common phenomenon than previously thought. We also determined which domains are most frequent in multiply created architectures and assessed whether specific functions could be attributed to them. However, no strong functional bias was found in architectures with multiple origins.
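
The idea of inferring ancestral architectures by maximum parsimony and then counting independent origins can be illustrated with a minimal sketch. It assumes the classic Fitch small-parsimony algorithm on a rooted binary tree, which may differ from the procedure actually used in the paper. The tree encoding, the function names (`fitch_sets`, `assign`, `count_origins`), and the toy architecture labels are all hypothetical.

```python
from typing import Optional, Union

# Hypothetical encoding: a leaf is an architecture label (str),
# an internal node is a (left, right) tuple.
Tree = Union[str, tuple]


def fitch_sets(tree: Tree) -> dict:
    """Bottom-up Fitch pass: compute the candidate state set of each node."""
    if isinstance(tree, str):
        return {"states": {tree}, "children": []}
    left, right = (fitch_sets(child) for child in tree)
    common = left["states"] & right["states"]
    # Intersection if non-empty, otherwise union (one change is implied).
    states = common if common else left["states"] | right["states"]
    return {"states": states, "children": [left, right]}


def assign(node: dict, parent_state: Optional[str] = None) -> None:
    """Top-down Fitch pass: keep the parent's state whenever it is allowed."""
    if parent_state in node["states"]:
        node["state"] = parent_state
    else:
        # Arbitrary but deterministic tie-break (an assumption of this sketch).
        node["state"] = min(node["states"])
    for child in node["children"]:
        assign(child, node["state"])


def count_origins(node: dict, target: str,
                  parent_state: Optional[str] = None) -> int:
    """Count nodes where `target` appears but was absent in the parent."""
    gained = node["state"] == target and parent_state != target
    return int(gained) + sum(
        count_origins(child, target, node["state"])
        for child in node["children"]
    )


# Toy domain tree: architecture "A+B" sits in two separate clades.
toy_tree = ((("A+B", "A+B"), "A"), ("A", ("A+B", "A")))
root = fitch_sets(toy_tree)
assign(root)
print(count_origins(root, "A+B"))  # prints 2
```

In this toy tree the root is reconstructed as `A`, so `A+B` is gained on two separate branches and is scored as having two origins. A fuller analysis along the lines the abstract describes would repeat such a count over bootstrap trees and keep only the well-supported cases.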

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computational Biology
  • Evolution, Molecular*
  • Protein Structure, Tertiary*
  • Software