Genomic studies of plants often seek to identify genetic factors associated with desirable traits. The process of evaluating genetic markers one by one (i.e. a marginal analysis) may not identify important polygenic and environmental effects. Further, confounding due to growing conditions/factors and genetic similarities among plant varieties may influence conclusions. When developing new plant varieties to optimize yield or thrive in future adverse conditions (e.g. flood, drought), scientists seek a complete understanding of how the factors influence desirable traits. Motivated by a study design that measures rice yield across different seasons, fields, and plant varieties in Indonesia, we develop a regression method that identifies significant genomic factors, while simultaneously controlling for field factors and genetic similarities in the plant varieties. Our approach develops a Bayesian maximum a posteriori probability (MAP) estimator under a generalized double Pareto shrinkage prior. Through a hierarchical representation of the proposed model, a novel and computationally efficient expectation-maximization (EM) algorithm is developed for variable selection and estimation. The performance of the proposed approach is demonstrated through simulation and is used to analyze rice yields from a pilot study conducted by the Indonesian Center for Rice Research.
Keywords: Bayesian hierarchical models; EM algorithm; MAP estimator; genomic studies; rice science; shrinkage prior.