Effects of differential genotyping error rate on the type I error probability of case-control studies

Valentina Moskvina; Nick Craddock; Peter Holmans; Michael J Owen; Michael C O'Donovan

doi:10.1159/000092553

Hum Hered. 2006;61(1):55-64. doi: 10.1159/000092553. Epub 2006 Apr 6.

Authors

Valentina Moskvina¹, Nick Craddock, Peter Holmans, Michael J Owen, Michael C O'Donovan

Affiliation

¹ Bioinformatics and Biostatistics Unit, School of Medicine, Wales College of Medicine, Cardiff University, Cardiff, UK. MoskvinaV1@cardiff.ac.uk

PMID: 16612103

Abstract

Objectives: It is well known that genotyping error adversely affects the power of genetic case-control association studies but there is little research on its effects on type I error, and none that has addressed possible differences in genotype error rates between cases and controls.

Methods: We used simulations to examine the influence of genotyping error on the type I error probability given by case-control studies. The effect of genotyping error on the magnitude of type I error was explored for a single marker of varying minor allele frequency (MAF), and for haplotypic tests based on two markers with varying MAF and linkage disequilibrium (LD) measure r².
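
The single-marker part of such a simulation can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the abstract does not specify the error model or the test statistic, so the sketch assumes independent allele-level misclassification and a 1-df allelic chi-square test at the 0.05 level. All function and parameter names (`observed_minor_count`, `allelic_chi2`, `type1_rate`, `case_err`, `ctrl_err`) are hypothetical.

```python
import random

# 0.05 critical value of the chi-square distribution with 1 degree of freedom
CHI2_CRIT_1DF = 3.841


def observed_minor_count(n_alleles, maf, err_common_to_rare, err_rare_to_common, rng):
    """Count alleles *called* as minor after allele-level genotyping error.

    Assumed error model (not taken from the paper): each allele is
    misclassified independently with a direction-specific probability.
    """
    count = 0
    for _ in range(n_alleles):
        is_minor = rng.random() < maf  # true allele state
        if is_minor:
            if rng.random() < err_rare_to_common:
                is_minor = False
        elif rng.random() < err_common_to_rare:
            is_minor = True
        if is_minor:
            count += 1
    return count


def allelic_chi2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 allele table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: no evidence of association
    return n * (a * d - b * c) ** 2 / denom


def type1_rate(n_per_group, maf, case_err, ctrl_err, n_reps=200, seed=1):
    """Estimate the type I error rate under the null (same true MAF in both groups).

    case_err and ctrl_err are the common-to-rare misclassification
    probabilities in cases and controls; rare-to-common error is set to 0.
    """
    rng = random.Random(seed)
    n_alleles = 2 * n_per_group  # two alleles per diploid individual
    rejections = 0
    for _ in range(n_reps):
        case_minor = observed_minor_count(n_alleles, maf, case_err, 0.0, rng)
        ctrl_minor = observed_minor_count(n_alleles, maf, ctrl_err, 0.0, rng)
        stat = allelic_chi2(case_minor, n_alleles - case_minor,
                            ctrl_minor, n_alleles - ctrl_minor)
        if stat > CHI2_CRIT_1DF:
            rejections += 1
    return rejections / n_reps
```

Comparing `type1_rate(2000, 0.05, 0.0, 0.0)` with `type1_rate(2000, 0.05, 0.01, 0.0)` contrasts a study with no genotyping error against one in which only the cases carry a 1% common-to-rare error. The haplotypic two-marker tests are not covered by this sketch.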

Results: We show that even with low genotyping error rates (<0.01), systematic differences in the error rate between samples can result in type I error rates substantially above 0.05. The effect was maximal for markers with small MAF, markers in strong LD, and where a common allele is more frequently misclassified as a rare allele than vice versa. The problem was also exacerbated by the use of large samples.
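
A short calculation suggests why small MAF and the direction of misclassification matter. It assumes a simple allele-level error model that the abstract does not specify: in one sample only, each allele is misread independently with probability $\varepsilon$, and the true minor allele frequency is $p$ in both samples. The symbols $\varepsilon$, $p$, and $p_{\mathrm{obs}}$ are introduced here for illustration.

```latex
% Common allele misread as rare with probability \varepsilon:
\[
p_{\mathrm{obs}} = p + \varepsilon (1 - p),
\qquad
\frac{p_{\mathrm{obs}} - p}{p} = \frac{\varepsilon (1 - p)}{p}
\]
% Rare allele misread as common with probability \varepsilon:
\[
p_{\mathrm{obs}} = p (1 - \varepsilon),
\qquad
\frac{p_{\mathrm{obs}} - p}{p} = -\varepsilon
\]
```

With $\varepsilon = 0.01$, common-to-rare error inflates the observed MAF by 19% when $p = 0.05$ but by only 1% when $p = 0.5$, whereas rare-to-common error changes it by 1% at any $p$. Under this model the spurious frequency difference between samples is fixed while sampling error shrinks as the sample grows, which is consistent with the problem being worse in large samples.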

Conclusions: Our results show that small differential genotyping error rates between cases and controls pose significant problems for association analyses. Differential genotyping error rates are particularly likely to arise where genotype data are combined from multiple sites, or where case genotypes are examined against archived reference population cohort genotypes that are being generated in several countries. Although these strategies may be necessary to obtain adequately powered samples, our data show the importance of stringent quality control. Furthermore, associations based on rare haplotypes should be treated with caution.

MeSH terms

Alleles
Case-Control Studies
Computer Simulation
Data Interpretation, Statistical*
Gene Frequency
Genotype*
Haplotypes
Humans
Linkage Disequilibrium
Models, Genetic*
Models, Statistical
Probability
Reproducibility of Results

Grants and funding

G9810900/MRC_/Medical Research Council/United Kingdom