Enrichment of statistical power for genome-wide association studies

BMC Biol. 2014 Oct 17:12:73. doi: 10.1186/s12915-014-0073-5.

Abstract

Background: The inheritance of most human diseases and agriculturally important traits is controlled by many genes with small effects. Identifying these genes, while simultaneously controlling false positives, is challenging. Among available statistical methods, the mixed linear model (MLM) has been the most flexible and powerful for controlling population structure and unequal relatedness among individuals (kinship), two common causes of spurious associations. The introduction of the compressed MLM (CMLM) method provided additional opportunities for optimization by adding two new model parameters: the grouping algorithm and the number of groups.

Results: This study introduces another model parameter to develop an enriched CMLM (ECMLM). The new parameter is the algorithm used to define kinship between groups (that is, the kinship algorithm). The ECMLM calculates between-group kinship using several different algorithms and then chooses the best combination of kinship algorithm and grouping algorithm.
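
The selection step described in the Results can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the four summary statistics (mean, median, max, min), the function names `group_kinship` and `best_combination`, and the caller-supplied `score` function are all assumptions introduced here, since the abstract names neither the kinship algorithms nor the criterion used to rank combinations.

```python
import itertools
import numpy as np

# Candidate ways to summarize kinship between two groups.
# These four are assumed for illustration; the abstract does not list them.
KINSHIP_SUMMARIES = {
    "mean": np.mean,
    "median": np.median,
    "max": np.max,
    "min": np.min,
}


def group_kinship(K, labels, summary):
    """Collapse an individual-level kinship matrix K (n x n) to group level.

    Entry (a, b) is `summary` applied to all pairwise kinship values between
    members of group a and members of group b. Diagonal entries include each
    individual's kinship with itself.
    """
    K = np.asarray(K, dtype=float)
    labels = np.asarray(labels)
    groups = np.unique(labels)
    G = np.empty((len(groups), len(groups)))
    for a, ga in enumerate(groups):
        for b, gb in enumerate(groups):
            block = K[np.ix_(labels == ga, labels == gb)]
            G[a, b] = summary(block)
    return G


def best_combination(K, groupings, score):
    """Try every (grouping, kinship summary) pair and keep the best one.

    groupings: dict mapping a grouping-algorithm name to a label array.
    score:     hypothetical callable(G, labels) -> float, higher is better
               (for example, a model-fit criterion chosen by the caller).
    Returns (best_score, grouping_name, kinship_name, group_kinship_matrix).
    """
    best = None
    for (g_name, labels), (k_name, summary) in itertools.product(
        groupings.items(), KINSHIP_SUMMARIES.items()
    ):
        G = group_kinship(K, labels, summary)
        s = score(G, labels)
        if best is None or s > best[0]:
            best = (s, g_name, k_name, G)
    return best
```

In use, `groupings` would hold the label arrays produced by each clustering method under consideration, and `score` would wrap whatever fit statistic the analyst uses to compare the resulting mixed models.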

Conclusion: Simulations show that the ECMLM increases statistical power. In some cases, the magnitude of power gained by using ECMLM instead of CMLM is larger than the improvement found by using CMLM instead of MLM.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Animals
  • Arabidopsis / genetics*
  • Dogs / genetics*
  • Genome, Plant*
  • Genome-Wide Association Study / methods*
  • Genome-Wide Association Study / veterinary
  • Humans
  • Linear Models
  • Models, Genetic
  • Models, Statistical
  • Zea mays / genetics*