High-resolution functional annotation of human transcriptome: predicting isoform functions by a novel multiple instance-based label propagation method

Nucleic Acids Res. 2014 Apr;42(6):e39. doi: 10.1093/nar/gkt1362. Epub 2013 Dec 25.

Abstract

Alternative transcript processing is an important mechanism for generating functional diversity in genes. However, little is known about the precise functions of individual isoforms. In fact, proteins (translated from transcript isoforms), not genes, are the function carriers. By integrating multiple human RNA-seq data sets, we carried out the first systematic prediction of isoform functions, enabling high-resolution functional annotation of the human transcriptome. Unlike gene function prediction, isoform function prediction faces a unique challenge: the lack of training data, because all known functional annotations are at the gene level. To address this challenge, we modelled the gene-isoform relationships as multiple instance data and developed a novel label propagation method to predict functions. Our method achieved an average area under the receiver operating characteristic curve of 0.67 and assigned functions to 15 572 isoforms. Interestingly, we observed that different functions have different sensitivities to alternative isoform processing, and that the functional diversity of isoforms from the same gene is positively correlated with their tissue expression diversity. Finally, we surveyed the literature to validate our predictions for a number of apoptotic genes. Strikingly, for the well-known 'TP53' gene, we not only accurately identified the apoptosis regulation function of its five isoforms, but also correctly predicted the precise direction of the regulation.
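
The multiple instance framing in the abstract can be illustrated with a small sketch. This is not the authors' algorithm, whose details the abstract does not give; it is a generic graph label propagation scheme (iterating f = alpha * S f + (1 - alpha) * y on a normalised similarity matrix) adapted under two stated assumptions: every isoform of a positively annotated gene starts with its gene's label, and isoforms of negatively annotated genes are clamped to zero. The function name, parameters, and toy similarity matrix are all hypothetical.

```python
import numpy as np


def propagate_isoform_scores(W, gene_of, gene_label, alpha=0.8, n_iter=100):
    """Generic label propagation over an isoform similarity graph (illustrative).

    W          : symmetric, non-negative isoform-isoform similarity matrix (n x n),
                 e.g. derived from co-expression (an assumption of this sketch).
    gene_of    : sequence giving the gene of each isoform (the "bag" of each instance).
    gene_label : dict mapping gene -> 1 (annotated with the function) or
                 0 (known not to have it); genes missing from the dict are unlabelled.
    Returns one score per isoform; higher means more likely to carry the function.
    """
    W = np.asarray(W, dtype=float)

    # Symmetric normalisation S = D^{-1/2} W D^{-1/2}; isolated nodes get weight 0.
    deg = W.sum(axis=1)
    inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.where(deg > 0, deg, 1.0)), 0.0)
    S = W * inv_sqrt[:, None] * inv_sqrt[None, :]

    # Initial instance labels: each isoform inherits its gene's label (unlabelled -> 0).
    y = np.array([float(gene_label.get(g, 0)) for g in gene_of])
    # Multiple instance constraint: all isoforms of a negative gene stay negative.
    negative = np.array([gene_label.get(g) == 0 for g in gene_of])

    f = y.copy()
    for _ in range(n_iter):
        f = alpha * (S @ f) + (1.0 - alpha) * y
        f[negative] = 0.0
    return f


# Toy data (hypothetical): gene A is annotated and has isoforms 0 and 1,
# gene B is unlabelled (isoform 2), gene C is a known negative (isoform 3).
W = [
    [0.0, 0.1, 1.0, 0.0],
    [0.1, 0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
]
scores = propagate_isoform_scores(W, ["A", "A", "B", "C"], {"A": 1, "C": 0})
```

In this toy graph the two isoforms of gene A end up with different scores because one is linked mainly to an unlabelled isoform and the other to an isoform of a negative gene, which is the kind of within-gene differentiation that gene-level annotation alone cannot express.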

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Apoptosis
  • Gene Expression Profiling*
  • Gene Regulatory Networks
  • Humans
  • Molecular Sequence Annotation*
  • Protein Isoforms / genetics
  • Protein Isoforms / metabolism
  • Protein Isoforms / physiology*
  • RNA Isoforms / metabolism
  • Sequence Analysis, RNA*

Substances

  • Protein Isoforms
  • RNA Isoforms