PAND: A Distribution to Identify Functional Linkage from Networks with Preferential Attachment Property

Hua Li; Pan Tong; Juan Gallegos; Emily Dimmer; Guoshuai Cai; Jeffrey J Molldrem; Shoudan Liang

doi:10.1371/journal.pone.0127968

PLoS One. 2015 Jul 9;10(7):e0127968. doi: 10.1371/journal.pone.0127968. eCollection 2015.

Authors

Hua Li¹, Pan Tong², Juan Gallegos³, Emily Dimmer⁴, Guoshuai Cai², Jeffrey J Molldrem⁵, Shoudan Liang⁶

Affiliations

¹ Bio-ID Center, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China; Department of Stem Cell Transplantation and Cellular Therapy, The University of Texas MD Anderson Cancer Center, Houston, Texas, 77030, United States of America.
² Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, 77030, United States of America.
³ Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, 77030, United States of America.
⁴ The EMBL Outstation-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom.
⁵ Department of Stem Cell Transplantation and Cellular Therapy, The University of Texas MD Anderson Cancer Center, Houston, Texas, 77030, United States of America.
⁶ Bio-ID Center, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China; Department of Bioinformatics and Computational Biology, The University of Texas MD Anderson Cancer Center, Houston, Texas, 77030, United States of America.

Abstract

Technological advances have immensely accelerated the large-scale mapping of biological networks, which necessitates the development of accurate and powerful network-based algorithms for functional inference. A prevailing approach is to leverage the functions of neighboring nodes to predict unknown molecular functions. However, existing neighbor-based algorithms have ignored the scale-free property hidden in many biological networks. By assuming that neighbor sharing is constrained by the preferential attachment property, we developed a Preferential Attachment based common Neighbor Distribution (PAND) to calculate the probability of the neighbor-sharing event between any two nodes in scale-free networks. In simulations, the probabilities calculated with PAND nearly perfectly matched the observed probabilities. By applying PAND to a human protein-protein interaction (PPI) network, we showed that smaller probabilities represented closer functional linkages between proteins. With the PAND-derived linkages, we were able to build new networks in which the links are more functionally reliable than those of the human PPI network. We then applied simple annotation schemes to a PAND-derived network to make reliable functional predictions for proteins. We also developed an R package called PANDA (PAND-derived functional Associations) to implement the methods proposed in this study. In conclusion, PAND is a useful distribution for calculating the probability of neighbor-sharing events in scale-free networks. With PAND, we are able to extract reliable functional linkages from real biological networks and build new networks that are better bases for further functional inference.
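To make the neighbor-sharing event concrete, the following Python sketch grows a small network by preferential attachment and estimates, by direct counting, how often two nodes share at least k neighbors. It is an illustration only: the abstract does not state the PAND formula, so none is reproduced here, and this is not the PANDA R package. The function names, the growth rule (each new node links to m distinct existing nodes chosen in proportion to their degree), and all parameter values are assumptions made for the example.

```python
import random
from itertools import combinations


def preferential_attachment_graph(n, m, seed=0):
    """Grow an undirected graph on n nodes (assumed growth rule, for illustration).

    Starts from a complete graph on m + 1 nodes; each later node links to
    m distinct existing nodes picked with probability proportional to degree.
    Returns a dict mapping each node to the set of its neighbors.
    """
    rng = random.Random(seed)
    neighbors = {i: set() for i in range(n)}
    stubs = []  # every node appears once per edge endpoint, so sampling is degree-weighted

    for u, v in combinations(range(m + 1), 2):
        neighbors[u].add(v)
        neighbors[v].add(u)
        stubs += [u, v]

    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            neighbors[new].add(t)
            neighbors[t].add(new)
            stubs += [new, t]
    return neighbors


def common_neighbors(neighbors, u, v):
    """Number of nodes adjacent to both u and v (the neighbor-sharing count)."""
    return len(neighbors[u] & neighbors[v])


def empirical_tail(neighbors, k):
    """Observed fraction of node pairs that share at least k neighbors."""
    pairs = list(combinations(neighbors, 2))
    hits = sum(1 for u, v in pairs if common_neighbors(neighbors, u, v) >= k)
    return hits / len(pairs)


if __name__ == "__main__":
    g = preferential_attachment_graph(n=200, m=3, seed=1)
    for k in range(4):
        print(k, empirical_tail(g, k))
```

An observed frequency of this kind is what a calculated probability would be compared against in a simulation. In a degree-aware treatment such as the one the abstract describes, the comparison would presumably be made with the degrees of the two nodes taken into account rather than pooled over all pairs as done here.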

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Computational Biology / methods*
Databases, Factual
Humans
Protein Interaction Maps

Grants and funding

P50 CA100632/CA/NCI NIH HHS/United States