Identification of candidate regulatory sequences in mammalian 3' UTRs by statistical analysis of oligonucleotide distributions

BMC Bioinformatics. 2007 May 24:8:174. doi: 10.1186/1471-2105-8-174.

Abstract

Background: 3' untranslated regions (3' UTRs) contain binding sites for many regulatory elements, and in particular for microRNAs (miRNAs). The importance of miRNA-mediated post-transcriptional regulation has become increasingly clear in the last few years.

Results: We propose two complementary approaches to the statistical analysis of oligonucleotide frequencies in mammalian 3' UTRs aimed at the identification of candidate binding sites for regulatory elements. The first method is based on the identification of sets of genes characterized by evolutionarily conserved overrepresentation of an oligonucleotide. The second method is based on the identification of oligonucleotides showing statistically significant strand asymmetry in their distribution in 3' UTRs.
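As an illustration of the second approach only, the following is a minimal sketch of how strand asymmetry might be scored for a single oligonucleotide. It is not the authors' procedure: the abstract does not specify their statistical test, so an exact two-sided binomial test against an equal-frequency null is assumed here. The function names, the DNA alphabet (T rather than U), and the counting of overlapping occurrences are all assumptions made for this example.

```python
from math import comb

# Assumption: sequences are given as DNA (A, C, G, T), as in most 3' UTR datasets.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(oligo):
    """Return the reverse complement of a DNA oligonucleotide."""
    return "".join(COMPLEMENT[base] for base in reversed(oligo))


def count_occurrences(oligo, sequences):
    """Count occurrences of oligo across all sequences, overlaps included."""
    k = len(oligo)
    return sum(
        seq[i:i + k] == oligo
        for seq in sequences
        for i in range(len(seq) - k + 1)
    )


def two_sided_binomial_p(count, total):
    """Exact two-sided binomial p-value under a null success probability of 0.5."""
    if total == 0:
        return 1.0
    extreme = max(count, total - count)
    # Upper tail probability of the more extreme side; doubling is valid
    # because the null distribution is symmetric at p = 0.5.
    tail = sum(comb(total, i) for i in range(extreme, total + 1)) / 2 ** total
    return min(1.0, 2 * tail)


def strand_asymmetry(oligo, sequences):
    """Compare counts of an oligo and its reverse complement in the given sequences.

    Returns (forward_count, reverse_count, p_value). Hypothetical helper,
    not taken from the paper.
    """
    partner = reverse_complement(oligo)
    if partner == oligo:
        # A palindromic oligo is its own reverse complement, so asymmetry is undefined.
        raise ValueError("strand asymmetry is undefined for palindromic oligos")
    forward = count_occurrences(oligo, sequences)
    reverse = count_occurrences(partner, sequences)
    return forward, reverse, two_sided_binomial_p(forward, forward + reverse)
```

For example, `strand_asymmetry("AAA", ["AAAAAAAA"])` counts six forward occurrences and none of the reverse complement `TTT`. Overlapping occurrences are not independent, so the binomial p-value in this sketch is at best a rough screening score, and a genome-wide scan over all oligonucleotides would also need a correction for multiple testing.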

Conclusion: Both methods are able to identify many previously known binding sites located in 3' UTRs, and in particular seed regions of known miRNAs. Many new candidates are proposed for experimental verification.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • 3' Untranslated Regions / genetics*
  • Algorithms*
  • Base Sequence
  • Data Interpretation, Statistical
  • MicroRNAs / genetics*
  • Models, Genetic
  • Models, Statistical
  • Molecular Sequence Data
  • Oligonucleotides / genetics*
  • Regulatory Elements, Transcriptional / genetics*
  • Sequence Analysis, RNA / methods*

Substances

  • 3' Untranslated Regions
  • MicroRNAs
  • Oligonucleotides