Extracting a few functionally reproducible biomarkers to build robust subnetwork-based classifiers for the diagnosis of cancer

Lin Zhang; Shan Li; Chunxiang Hao; Guini Hong; Jinfeng Zou; Yuannv Zhang; Pengfei Li; Zheng Guo

doi:10.1016/j.gene.2013.05.011

Gene. 2013 Sep 10;526(2):232-8. doi: 10.1016/j.gene.2013.05.011. Epub 2013 May 22.

Authors

Lin Zhang¹, Shan Li, Chunxiang Hao, Guini Hong, Jinfeng Zou, Yuannv Zhang, Pengfei Li, Zheng Guo

Affiliation

¹ Bioinformatics Centre, Key Laboratory for NeuroInformation of Ministry of Education and School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China. linzhang.bioinformatics@gmail.com

PMID: 23707927
DOI: 10.1016/j.gene.2013.05.011

Abstract

In microarray-based case-control studies of a disease, researchers often attempt to identify a few diagnostic or prognostic markers among the most significant differentially expressed (DE) genes. However, the reproducibility of DE genes identified in different studies of the same disease is typically very low. To tackle this problem, the reproducibility of DE genes can be evaluated across studies, and robust markers for disease diagnosis can be defined using a disease-associated protein-protein interaction (PPI) subnetwork. Using datasets for four cancer types, we found that the most significant DE genes in cancer exhibit consistent up- or down-regulation across different datasets. For each cancer type, the 5 (or 10) most significant DE genes extracted separately from different datasets tend to be significantly coexpressed and closely connected in the PPI subnetwork, indicating that they are highly reproducible at the PPI level. Consequently, we were able to build robust subnetwork-based classifiers for cancer diagnosis.

Keywords: Cancer; Diagnosis; Gene expression profiling; Protein interaction networks; Reproducibility of biomarkers.

Abbreviations: DE, differentially expressed; FDR, false discovery rate; PO, percentage of overlap; POD, percentage of overlapping deregulations; PON, percentage of overlap in the PPI network; PPI, protein–protein interaction; RFE, recursive feature elimination; SAM, significance analysis of microarray; SVM, support vector machine.
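
The cross-study reproducibility measures named above (PO and POD) can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so the definitions used here are assumptions: PO is taken as the number of shared genes divided by the length of the shorter DE list, and POD as the fraction of shared genes deregulated in the same direction in both studies. The function name `overlap_metrics` and the gene labels are hypothetical.

```python
from typing import Dict, Tuple


def overlap_metrics(de_a: Dict[str, float], de_b: Dict[str, float]) -> Tuple[float, float]:
    """Return (PO, POD) for two DE gene lists.

    Each dict maps a gene symbol to a signed statistic
    (positive = up-regulated in cancer, negative = down-regulated).
    Both formulas are assumed definitions, not taken from the paper.
    """
    shared = set(de_a) & set(de_b)
    shorter = min(len(de_a), len(de_b))
    if shorter == 0:
        return 0.0, 0.0

    # Assumed PO: shared genes relative to the shorter list.
    po = len(shared) / shorter
    if not shared:
        return po, 0.0

    # Assumed POD: shared genes with the same direction of deregulation.
    same_direction = sum(1 for g in shared if (de_a[g] > 0) == (de_b[g] > 0))
    pod = same_direction / len(shared)
    return po, pod


# Hypothetical DE lists from two studies of the same cancer type.
study_1 = {"G1": 2.0, "G2": -1.5, "G3": 3.0, "G4": 1.0}
study_2 = {"G2": -2.0, "G3": -1.0, "G5": 4.0}
po, pod = overlap_metrics(study_1, study_2)
print(f"PO = {po:.2f}, POD = {pod:.2f}")
```

In this toy input, two genes are shared and one of them changes direction between studies, which is the kind of inconsistency a direction-aware measure such as POD is meant to expose and a plain overlap count would miss.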

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Biomarkers, Tumor / genetics*
Biomarkers, Tumor / metabolism
Case-Control Studies
Computational Biology
Gene Expression Profiling*
Gene Expression Regulation, Neoplastic*
Humans
Neoplasms / diagnosis*
Neoplasms / genetics*
Neoplasms / metabolism
Protein Interaction Mapping
Protein Interaction Maps
Reproducibility of Results

Substances

Biomarkers, Tumor