Genome-wide identification of clusters of predicted microRNA binding sites as microRNA sponge candidates

PLoS One. 2018 Aug 24;13(8):e0202369. doi: 10.1371/journal.pone.0202369. eCollection 2018.

Abstract

The number of natural miRNA sponges discovered in plants, viruses, and mammals is increasing steadily. Some sponges, such as ciRS-7 for miR-7, contain multiple miRNA binding sites in close proximity. We hypothesize that such clusters of miRNA binding sites in the genome can function together as a sponge. No systematic search for clusters of miRNA target sites has been made. Here we present what is, to our knowledge, the first genome-wide prediction of target site clusters for mature human miRNAs. For each miRNA, we predict target sites on a genome-wide scale, build a graph with edge weights based on the pairwise distances between sites, and apply Markov clustering to identify genomic regions with high binding site density. Significant clusters are then extracted by comparing cluster sizes between the real genome and shuffled genomes that preserve local properties such as GC content. We then filter by conservation and binding energy to obtain a final set of miRNA target site clusters, or sponge candidates. Our pipeline predicts 3673 sponge candidates for 1250 miRNAs, including the experimentally verified miR-7 sponge ciRS-7. In addition, we point explicitly to 19 high-confidence candidates overlapping annotated genomic sequence. The full list of candidates is freely available at http://rth.dk/resources/mirnasponge, where detailed properties of individual candidates, such as alignment details, conservation, accessibility and target profiles, can be explored to facilitate the selection of sponge candidates for further context-specific analysis.
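
As an illustration of the graph-building and Markov clustering step described in the abstract, the following is a minimal Python sketch and not the authors' implementation. The function names, the linear mapping from distance to edge weight, the 200-nt distance cutoff, and the inflation value of 2.0 are all assumptions chosen for illustration; the abstract does not specify them.

```python
def build_graph(positions, max_dist=200):
    """Weighted adjacency matrix over predicted sites on one sequence.

    Assumption: the weight decays linearly with distance and is zero at or
    beyond max_dist. Distance 0 gives each site a self-loop of weight 1.
    """
    n = len(positions)
    graph = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = abs(positions[i] - positions[j])
            if d < max_dist:
                graph[i][j] = 1.0 - d / max_dist
    return graph


def normalize_columns(m):
    """Scale every column to sum to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def expand(m):
    """Expansion step: square the matrix (two-step random walks)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, power):
    """Inflation step: element-wise power, then re-normalize the columns."""
    return normalize_columns([[x ** power for x in row] for row in m])


def markov_cluster(graph, inflation=2.0, max_iter=100, tol=1e-9):
    """Alternate expansion and inflation until the matrix stops changing."""
    n = len(graph)
    if n == 0:
        return []
    m = normalize_columns(graph)
    for _ in range(max_iter):
        nxt = inflate(expand(m), inflation)
        delta = max(abs(nxt[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = nxt
        if delta < tol:
            break
    # Each row that keeps non-negligible mass is an attractor; the columns
    # it covers form one cluster. Identical clusters are merged by the set.
    clusters = set()
    for row in m:
        members = tuple(j for j, x in enumerate(row) if x > 1e-6)
        if members:
            clusters.add(members)
    return sorted(clusters)


def cluster_sites(positions, max_dist=200, inflation=2.0):
    """Cluster site positions; returns tuples of indices into positions."""
    return markov_cluster(build_graph(positions, max_dist), inflation)


# Hypothetical site coordinates: two dense regions and one isolated site.
sites = [100, 120, 150, 5000, 5030, 5045, 90000]
print(cluster_sites(sites))
```

In this sketch, sites farther apart than the cutoff share no edge, so they can never be placed in the same cluster. A higher inflation value would tend to split dense regions into smaller clusters. The later steps (comparison against shuffled genomes, then filtering by conservation and binding energy) are not covered here.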

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Genome, Human*
  • Genome-Wide Association Study
  • Humans
  • MicroRNAs / genetics*
  • MicroRNAs / metabolism
  • Sequence Analysis, RNA / methods*

Substances

  • MIRN7 microRNA, human
  • MicroRNAs

Grants and funding

This project was mainly financed by the University of Copenhagen, with additional support from the Novo Nordisk Foundation [NNF14CC0001], the Danish Center for Scientific Computing (DCSC/DEiC), and Innovation Fund Denmark (Programme Commission on Strategic Growth Technologies) [0603-00320B].