Flexible empirical Bayes models for differential gene expression

Kenneth Lo; Raphael Gottardo

doi:10.1093/bioinformatics/btl612

Bioinformatics. 2007 Feb 1;23(3):328-35. doi: 10.1093/bioinformatics/btl612. Epub 2006 Nov 30.

Authors

Kenneth Lo¹, Raphael Gottardo

Affiliation

¹ Department of Statistics, University of British Columbia, 333-6356 Agricultural Road, Vancouver, BC, Canada V6T 1Z2. c.lo@stat.ubc.ca

PMID: 17138586

Abstract

Motivation: Inference about differential expression is a typical objective when analyzing gene expression data. Recently, Bayesian hierarchical models have become increasingly popular for this type of problem. The two most common hierarchical models are the hierarchical Gamma-Gamma (GG) and Lognormal-Normal (LNN) models. However, to facilitate inference, some unrealistic assumptions have been made. One such assumption is that of a common coefficient of variation across genes, which can adversely affect the resulting inference.
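To illustrate the common coefficient of variation assumption: for a Gamma distribution with shape a, the coefficient of variation is 1/sqrt(a) whatever the scale, and for a lognormal distribution with log-scale standard deviation sigma it is sqrt(exp(sigma^2) - 1) whatever the log-scale mean. Sharing a single shape (GG) or a single log-scale variance (LNN) across genes therefore forces every gene to have the same coefficient of variation. The sketch below checks this by simulation. The function names, the uniform distribution for the gene-specific scale, and all parameter values are illustrative assumptions, not taken from the paper or from the FEBarrays code.

```python
import math
import random
import statistics


def gamma_cv(shape):
    """Theoretical coefficient of variation of a Gamma distribution."""
    return 1.0 / math.sqrt(shape)


def lognormal_cv(sigma):
    """Theoretical coefficient of variation of a lognormal distribution
    whose logarithm has standard deviation sigma."""
    return math.sqrt(math.exp(sigma ** 2) - 1.0)


def simulate_gene_cvs(n_genes, n_reps, shape, seed=0):
    """Simulate Gamma intensities with a shape shared across genes and
    return the empirical coefficient of variation of each gene."""
    rng = random.Random(seed)
    cvs = []
    for _ in range(n_genes):
        # Gene-specific scale; the uniform range is an arbitrary choice
        # made only for this illustration.
        scale = rng.uniform(10.0, 1000.0)
        values = [rng.gammavariate(shape, scale) for _ in range(n_reps)]
        cvs.append(statistics.stdev(values) / statistics.mean(values))
    return cvs


if __name__ == "__main__":
    print("theoretical Gamma CV:", gamma_cv(4.0))
    for cv in simulate_gene_cvs(n_genes=5, n_reps=20000, shape=4.0):
        print("empirical CV:", round(cv, 3))
```

Although the simulated genes differ widely in mean intensity, their empirical coefficients of variation all sit close to the same theoretical value, which is the restriction that gene-specific variances relax.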

Results: In this paper, we extend both the GG and LNN modeling frameworks to allow for gene-specific variances and propose EM-based algorithms for parameter estimation. The proposed methodology is evaluated on three experimental datasets: one cDNA microarray experiment and two Affymetrix spike-in experiments. Compared with the original models, the two extended models significantly reduce the false positive rate while maintaining high sensitivity. Finally, a simulation study shows that the new frameworks are also more robust to model misspecification.
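In a spike-in experiment the truly differentially expressed genes are known, so the two metrics named above can be computed directly from a list of calls. The sketch below assumes the usual definitions: false positive rate = FP / (FP + TN) and sensitivity = TP / (TP + FN). The abstract does not state the exact definitions used in the paper, and the function name and the toy data are hypothetical.

```python
def false_positive_rate_and_sensitivity(called, truth):
    """Return (false positive rate, sensitivity) for boolean gene calls.

    called[i] is True if gene i was declared differentially expressed;
    truth[i] is True if gene i is truly differentially expressed
    (for example, a spiked-in gene).
    """
    if len(called) != len(truth):
        raise ValueError("called and truth must have the same length")

    tp = sum(1 for c, t in zip(called, truth) if c and t)
    fp = sum(1 for c, t in zip(called, truth) if c and not t)
    fn = sum(1 for c, t in zip(called, truth) if not c and t)
    tn = sum(1 for c, t in zip(called, truth) if not c and not t)

    # Return NaN when a rate is undefined (no true negatives or positives).
    fpr = fp / (fp + tn) if (fp + tn) > 0 else float("nan")
    sensitivity = tp / (tp + fn) if (tp + fn) > 0 else float("nan")
    return fpr, sensitivity


if __name__ == "__main__":
    # Toy data: two truly changed genes and four unchanged genes.
    truth = [True, True, False, False, False, False]
    called = [True, False, True, False, False, False]
    print(false_positive_rate_and_sensitivity(called, truth))
```

A method improves on another in the sense described above when it lowers the first number without substantially lowering the second.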

Availability: The R code for implementing the proposed methodology can be downloaded at http://www.stat.ubc.ca/~c.lo/FEBarrays.

Supplementary information: The supplementary material is available at http://www.stat.ubc.ca/~c.lo/FEBarrays/supp.pdf.

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Algorithms*
Bayes Theorem
Data Interpretation, Statistical
Gene Expression Profiling / methods*
Logistic Models
Models, Genetic*
Models, Statistical
Oligonucleotide Array Sequence Analysis / methods*