The benefits of permutation-based genome-wide association studies

J Exp Bot. 2024 Sep 11;75(17):5377-5389. doi: 10.1093/jxb/erae280.

Abstract

Linear mixed models (LMMs) are commonly used in genome-wide association studies (GWAS), which aim to detect associations between genetic markers and phenotypic measurements in a population of individuals while accounting for population structure and cryptic relatedness. In a standard GWAS, hundreds of thousands to millions of statistical tests are performed, requiring control for multiple hypothesis testing. Typically, static corrections that penalize the number of tests performed are used to control the family-wise error rate, which is the probability of obtaining at least one false positive. However, it has been shown that in practice the resulting threshold is too conservative for normally distributed phenotypes and not stringent enough for non-normally distributed phenotypes. Therefore, permutation-based LMM approaches have recently been proposed to provide a more realistic threshold that takes phenotypic distributions into account. In this work, we discuss the advantages of permutation-based GWAS approaches, including new simulations and results from a re-analysis of all publicly available Arabidopsis phenotypes from the AraPheno database.
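The contrast between a static correction and a permutation-based threshold can be illustrated with a minimal Python sketch. It is not the method used in the article: it replaces the LMM with a plain per-marker Pearson correlation, ignores population structure and relatedness, and uses simulated toy data. All function names (`bonferroni_threshold`, `abs_marker_correlations`, `permutation_threshold`) and parameters are hypothetical and chosen for illustration only. The permutation threshold here is the (1 - alpha) quantile of the maximum test statistic observed across phenotype permutations, so it is expressed on the statistic scale rather than as a p-value.

```python
import numpy as np


def bonferroni_threshold(n_markers, alpha=0.05):
    """Static p-value threshold: alpha divided by the number of tests."""
    return alpha / n_markers


def abs_marker_correlations(genotypes, phenotype):
    """Absolute Pearson correlation of each marker column with the phenotype.

    Assumes every marker column varies across individuals (no monomorphic
    markers), otherwise the denominator would be zero.
    """
    g = genotypes - genotypes.mean(axis=0)
    y = phenotype - phenotype.mean()
    denom = np.sqrt((g ** 2).sum(axis=0) * (y ** 2).sum())
    return np.abs(g.T @ y) / denom


def permutation_threshold(genotypes, phenotype, n_perm=200, alpha=0.05, seed=0):
    """Estimate a family-wise threshold from permuted phenotypes.

    Each permutation breaks the marker-phenotype link while keeping the
    phenotype's distribution; the largest statistic per permutation is
    recorded, and the (1 - alpha) quantile of those maxima is returned.
    """
    rng = np.random.default_rng(seed)
    max_stats = np.empty(n_perm)
    for i in range(n_perm):
        permuted = rng.permutation(phenotype)
        max_stats[i] = abs_marker_correlations(genotypes, permuted).max()
    return np.quantile(max_stats, 1.0 - alpha)


# Toy data (simulated, not from AraPheno): 100 individuals, 500 markers.
rng = np.random.default_rng(1)
genotypes = rng.integers(0, 3, size=(100, 500)).astype(float)
phenotype = rng.standard_normal(100)

threshold = permutation_threshold(genotypes, phenotype)
significant = abs_marker_correlations(genotypes, phenotype) > threshold
```

Because the permuted values are drawn from the observed phenotype itself, the estimated threshold adapts to skewed or heavy-tailed phenotypes, whereas `bonferroni_threshold` depends only on the number of markers. A permutation scheme suited to LMMs would additionally have to preserve the covariance structure among individuals, which this sketch does not attempt.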

Keywords: Arabidopsis; GPU; genome-wide association studies (GWAS); linear mixed models; multiple hypothesis testing; permutations.

MeSH terms

  • Arabidopsis* / genetics
  • Computer Simulation
  • Genome-Wide Association Study*
  • Linear Models
  • Models, Genetic
  • Phenotype*