DTi2Vec: Drug-target interaction prediction using network embedding and ensemble learning

Maha A Thafar; Rawan S Olayan; Somayah Albaradei; Vladimir B Bajic; Takashi Gojobori; Magbubah Essack; Xin Gao

doi:10.1186/s13321-021-00552-w

J Cheminform. 2021 Sep 22;13(1):71. doi: 10.1186/s13321-021-00552-w.

Authors

Maha A Thafar¹ ², Rawan S Olayan³, Somayah Albaradei¹ ⁴, Vladimir B Bajic¹, Takashi Gojobori¹, Magbubah Essack⁵, Xin Gao⁶

Affiliations

¹ Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center, Computer (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia.
² College of Computers and Information Technology, Computer Science Department, Taif University, Taif, Kingdom of Saudi Arabia.
³ The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA.
⁴ Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Kingdom of Saudi Arabia.
⁵ Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center, Computer (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia. magbubah.essack@kaust.edu.sa.
⁶ Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center, Computer (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Kingdom of Saudi Arabia. xin.gao@kaust.edu.sa.

Abstract

Drug-target interaction (DTI) prediction is a crucial step in drug discovery and repositioning because accurate predictions reduce the cost of experimental validation. Developing in-silico methods to predict potential DTIs has therefore become a competitive research niche, with improving prediction accuracy as one of its main focuses. Machine learning (ML) models, specifically network-based approaches, are effective for this task and have shown clear advantages over other computational methods. However, ML model development involves upstream hand-crafted feature extraction and other processes that affect prediction accuracy. Network-based representation learning techniques that automate feature extraction, combined with traditional ML classifiers that handle the downstream link prediction task, may therefore be a better-suited paradigm. Here, we present such a method, DTi2Vec, which identifies DTIs using network representation learning and ensemble learning techniques. DTi2Vec constructs a heterogeneous network and then automatically generates features for each drug and target using a node embedding technique. DTi2Vec demonstrated its drug-target link prediction ability in comparisons with several state-of-the-art network-based methods, using four benchmark datasets and large-scale data compiled from DrugBank. DTi2Vec showed a statistically significant increase in prediction performance in terms of the area under the precision-recall curve (AUPR). We verified the "novel" predicted DTIs using several databases and the scientific literature. DTi2Vec is a simple yet effective method that provides high DTI prediction performance while being scalable and computationally efficient, making it a powerful drug repositioning tool.

Keywords: Cheminformatics; Drug repositioning; Drug–target interaction; Ensemble learning; Heterogeneous network; Link prediction; Network embedding; Random walk; Representation learning.
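
The pipeline outlined in the abstract (build a heterogeneous network, derive node features from it, then classify drug-target pairs) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the toy edge lists, the function names, the uniform random-walk strategy, and the Hadamard-product fusion of two node vectors are all assumptions made for illustration. The walks would normally be fed to a skip-gram style embedding model, and the fused pair features to an ensemble classifier; neither step is shown here.

```python
import random
from collections import defaultdict

# Toy, made-up inputs (hypothetical names; not data from the paper).
known_dtis = [("d1", "t1"), ("d2", "t1"), ("d3", "t2")]  # drug-target edges
drug_sim = [("d1", "d2"), ("d2", "d3")]                  # drug-drug similarity edges
target_sim = [("t1", "t2")]                              # target-target similarity edges


def build_heterogeneous_graph(dtis, drug_edges, target_edges):
    """Merge the three edge sets into one undirected adjacency map."""
    graph = defaultdict(set)
    for u, v in list(dtis) + list(drug_edges) + list(target_edges):
        graph[u].add(v)
        graph[v].add(u)
    # Sort neighbours so walks are reproducible for a fixed seed.
    return {node: sorted(nbrs) for node, nbrs in graph.items()}


def random_walks(graph, walk_length=5, walks_per_node=2, seed=0):
    """Uniform random walks from every node (an assumed, simplified strategy)."""
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        for start in sorted(graph):
            walk = [start]
            while len(walk) < walk_length:
                walk.append(rng.choice(graph[walk[-1]]))
            walks.append(walk)
    return walks


def pair_features(drug_vec, target_vec):
    """Fuse two node vectors into one pair feature vector.

    The element-wise (Hadamard) product is an assumed choice for this sketch.
    """
    return [a * b for a, b in zip(drug_vec, target_vec)]


graph = build_heterogeneous_graph(known_dtis, drug_sim, target_sim)
walks = random_walks(graph)
```

Each walk is a sequence of node identifiers that can be treated like a sentence by an embedding model, which is what lets features be learned from the network instead of being hand-crafted.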
