Motivation: Inference about differential expression is a typical objective when analyzing gene expression data. Recently, Bayesian hierarchical models have become increasingly popular for this type of problem. The two most common hierarchical models are the hierarchical Gamma-Gamma (GG) and Lognormal-Normal (LNN) models. However, to facilitate inference, some unrealistic assumptions have been made. One such assumption is that of a common coefficient of variation across genes, which can adversely affect the resulting inference.
Results: In this paper, we extend both the GG and LNN modeling frameworks to allow for gene-specific variances and propose EM based algorithms for parameter estimation. The proposed methodology is evaluated on three experimental datasets: one cDNA microarray experiment and two Affymetrix spike-in experiments. The two extended models significantly reduce the false positive rate while keeping a high sensitivity when compared to the originals. Finally, using a simulation study we show that the new frameworks are also more robust to model misspecification.
Availability: The R code for implementing the proposed methodology can be downloaded at http://www.stat.ubc.ca/~c.lo/FEBarrays.
Supplementary information: The supplementary material is available at http://www.stat.ubc.ca/~c.lo/FEBarrays/supp.pdf.