Prediction of RNA-binding protein and alternative splicing event associations during epithelial-mesenchymal transition based on inductive matrix completion

Brief Bioinform. 2021 Sep 2;22(5):bbaa440. doi: 10.1093/bib/bbaa440.

Abstract

Motivation: The developmental process of epithelial-mesenchymal transition (EMT) is abnormally activated during breast cancer metastasis. Transcriptional regulatory networks that control EMT have been well studied; however, alternative RNA splicing also plays a vital regulatory role during this process, and its regulatory mechanism needs further exploration. Because of the high cost and complexity of biological experiments, the underlying mechanisms of alternative splicing (AS) and the associated RNA-binding proteins (RBPs) that regulate the EMT process remain largely unknown. Thus, there is an urgent need to develop computational methods for predicting potential RBP-AS event associations during EMT.

Results: We developed a novel model for RBP-AS target prediction during EMT that is based on inductive matrix completion (RAIMC). Integrated RBP similarities were calculated from RBP regulating similarity and RBP Gaussian interaction profile (GIP) kernel similarity, while integrated AS event similarities were computed from AS event module similarity and AS event GIP kernel similarity. Our primary objective was to complete missing or unknown RBP-AS event associations based on known associations and on the integrated RBP and AS event similarities. In this paper, we identify significant RBPs for AS events during EMT and discuss potential regulating mechanisms. Our computational results confirm the effectiveness of our model and its superiority over other state-of-the-art methods. RAIMC achieved AUC values of 0.9587 and 0.9765 under leave-one-out cross-validation (CV) and 5-fold CV, respectively, exceeding the AUC values of the previous models. RAIMC is a general matrix completion framework that can be adopted to predict associations between other biological entities. We further validated the prediction performance of RAIMC on the genes CD44 and MAP3K7, for which RAIMC can identify the related regulating RBPs of their isoforms.
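
As an illustration of the GIP kernel similarity mentioned above, the following is a minimal sketch and not the authors' implementation. The abstract does not state the formula, so the sketch assumes the commonly used definition K(i, j) = exp(-gamma * ||a_i - a_j||^2), where a_i is the association profile (row) of entity i and gamma is a bandwidth divided by the mean squared profile norm. The function name gip_kernel, the bandwidth parameter, and the toy association matrix are hypothetical and chosen only for illustration.

```python
import numpy as np


def gip_kernel(assoc, bandwidth=1.0):
    """Gaussian interaction profile kernel over the rows of `assoc`.

    Assumed (not taken from the abstract): gamma = bandwidth / mean(||row||^2).
    """
    assoc = np.asarray(assoc, dtype=float)
    sq_norms = np.sum(assoc ** 2, axis=1)
    mean_sq = sq_norms.mean()
    if mean_sq == 0:
        # No known associations at all: fall back to the identity matrix.
        return np.eye(assoc.shape[0])
    gamma = bandwidth / mean_sq
    # Pairwise squared Euclidean distances between row profiles.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (assoc @ assoc.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negative rounding errors
    return np.exp(-gamma * sq_dists)


# Toy binary matrix (made up): rows are RBPs, columns are AS events.
associations = np.array([
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
])

rbp_gip = gip_kernel(associations)    # RBP-by-RBP similarity, shape (3, 3)
as_gip = gip_kernel(associations.T)   # AS-event-by-AS-event similarity, shape (4, 4)
```

In this toy example the first two RBPs share an identical association profile and therefore receive similarity 1, while the third RBP, which regulates a disjoint set of events, receives a much lower similarity to both. How RAIMC combines the GIP kernel with the regulating and module similarities is not described in the abstract and is not sketched here.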

Availability and implementation: The source code for RAIMC is available at https://github.com/yushanqiu/RAIMC.

Contact: zouquan@nclab.net

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alternative Splicing*
  • Breast Neoplasms* / genetics
  • Breast Neoplasms* / metabolism
  • Epithelial-Mesenchymal Transition / genetics*
  • Female
  • Gene Expression Regulation, Neoplastic*
  • Gene Regulatory Networks*
  • Humans
  • Neoplasm Proteins* / genetics
  • Neoplasm Proteins* / metabolism
  • RNA-Binding Proteins* / genetics
  • RNA-Binding Proteins* / metabolism

Substances

  • Neoplasm Proteins
  • RNA-Binding Proteins