Extracting a few functionally reproducible biomarkers to build robust subnetwork-based classifiers for the diagnosis of cancer

Gene. 2013 Sep 10;526(2):232-8. doi: 10.1016/j.gene.2013.05.011. Epub 2013 May 22.

Abstract

In microarray-based case-control studies of a disease, researchers often attempt to identify a few diagnostic or prognostic markers among the most significant differentially expressed (DE) genes. However, the reproducibility of DE genes identified in different studies of the same disease is typically very low. To tackle this problem, we evaluate the reproducibility of DE genes across studies and define robust markers for disease diagnosis using a disease-associated protein-protein interaction (PPI) subnetwork. Using datasets for four cancer types, we found that the most significant DE genes in cancer exhibit consistent up- or down-regulation across datasets. For each cancer type, the 5 (or 10) most significant DE genes extracted separately from different datasets tend to be significantly coexpressed and closely connected in the PPI subnetwork, indicating that they are highly reproducible at the PPI level. Consequently, we were able to build robust subnetwork-based classifiers for cancer diagnosis.
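To make the idea of cross-study reproducibility concrete, the following is a minimal sketch of two of the measures named in the keywords: the percentage of overlap (PO) between two top-ranked DE gene lists, and the percentage of overlapping deregulations (POD), i.e. shared genes that change in the same direction. The abstract does not give the exact definitions, so the normalisations used here (shared genes divided by the shorter list for PO, concordant genes divided by shared genes for POD) are assumptions, and the gene names and directions are hypothetical placeholders rather than data from the study.

```python
def percentage_of_overlap(genes_a, genes_b):
    """Fraction of genes shared by two DE gene lists.

    Assumption: normalised by the length of the shorter list.
    """
    set_a, set_b = set(genes_a), set(genes_b)
    shorter = min(len(set_a), len(set_b))
    if shorter == 0:
        return 0.0
    return len(set_a & set_b) / shorter


def percentage_of_overlapping_deregulations(direction_a, direction_b):
    """Fraction of shared genes deregulated in the same direction.

    Each argument maps a gene name to "up" or "down".
    Assumption: normalised by the number of shared genes.
    """
    shared = set(direction_a) & set(direction_b)
    if not shared:
        return 0.0
    concordant = sum(1 for g in shared if direction_a[g] == direction_b[g])
    return concordant / len(shared)


# Hypothetical top-5 DE genes from two studies of the same cancer type.
study_1 = {"G1": "up", "G2": "up", "G3": "down", "G4": "down", "G5": "up"}
study_2 = {"G1": "up", "G3": "down", "G4": "up", "G6": "down", "G7": "up"}

po = percentage_of_overlap(study_1.keys(), study_2.keys())
pod = percentage_of_overlapping_deregulations(study_1, study_2)
print(f"PO = {po:.2f}, POD = {pod:.2f}")
```

In this toy case three of the five genes are shared and two of those three agree in direction. A PPI-level measure would additionally count genes that differ between the lists but are linked in the interaction network; that step is not sketched here because it depends on the network data used.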

Keywords: Cancer; Diagnosis; Gene expression profiling; Protein interaction networks; Reproducibility of biomarkers.

Abbreviations: DE, differentially expressed; FDR, false discovery rate; PO, percentage of overlap; POD, percentage of overlapping deregulations; PON, percentage of overlap in the PPI network; PPI, protein–protein interaction; RFE, recursive feature elimination; SAM, significance analysis of microarray; SVM, support vector machine.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Biomarkers, Tumor / genetics*
  • Biomarkers, Tumor / metabolism
  • Case-Control Studies
  • Computational Biology
  • Gene Expression Profiling*
  • Gene Expression Regulation, Neoplastic*
  • Humans
  • Neoplasms / diagnosis*
  • Neoplasms / genetics*
  • Neoplasms / metabolism
  • Protein Interaction Mapping
  • Protein Interaction Maps
  • Reproducibility of Results

Substances

  • Biomarkers, Tumor