An evaluation of HapMap sample size and tagging SNP performance in large-scale empirical and simulated data sets

Eleftheria Zeggini; William Rayner; Andrew P Morris; Andrew T Hattersley; Mark Walker; Graham A Hitman; Panos Deloukas; Lon R Cardon; Mark I McCarthy

doi:10.1038/ng1670

Nat Genet. 2005 Dec;37(12):1320-2. doi: 10.1038/ng1670. Epub 2005 Oct 30.

Authors

Eleftheria Zeggini¹, William Rayner, Andrew P Morris, Andrew T Hattersley, Mark Walker, Graham A Hitman, Panos Deloukas, Lon R Cardon, Mark I McCarthy

Affiliation

¹ Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK. elez@well.ox.ac.uk

PMID: 16258542
DOI: 10.1038/ng1670

Abstract

A substantial investment has been made in the generation of large public resources designed to enable the identification of tag SNP sets, but data establishing the adequacy of the sample sizes used are limited. Using large-scale empirical and simulated data sets, we found that the sample sizes used in the HapMap project are sufficient to capture common variation, but that performance declines substantially for variants with minor allele frequencies of <5%.
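One contributing reason low-frequency variants are harder to capture in a modest reference panel can be illustrated with a simple sampling calculation. The sketch below is not the authors' method. It assumes unrelated individuals and independent chromosomes, and computes the probability that a variant with a given minor allele frequency is seen at least once in the panel. The function name and the panel size of 60 individuals are hypothetical choices for illustration and are not taken from the abstract.

```python
def prob_minor_allele_observed(maf: float, n_individuals: int) -> float:
    """Probability that at least one copy of the minor allele is sampled.

    Assumes each of the 2 * n_individuals chromosomes is an independent
    draw with minor allele probability `maf` (an illustrative simplification).
    """
    if not 0.0 <= maf <= 0.5:
        raise ValueError("maf must lie in [0, 0.5]")
    if n_individuals < 0:
        raise ValueError("n_individuals must be non-negative")
    n_chromosomes = 2 * n_individuals
    return 1.0 - (1.0 - maf) ** n_chromosomes


if __name__ == "__main__":
    # Hypothetical panel of 60 unrelated individuals (120 chromosomes).
    for maf in (0.01, 0.02, 0.05, 0.10, 0.20):
        p = prob_minor_allele_observed(maf, 60)
        print(f"MAF {maf:.2f}: P(observed at least once) = {p:.4f}")
```

Under these assumptions a variant with a minor allele frequency of 5% is almost certain to appear in the panel, whereas one at 1% is missed in roughly three panels out of ten. Merely observing a variant is a weaker requirement than estimating linkage disequilibrium accurately enough to choose a tag SNP for it, so this calculation gives only a lower bound on the difficulty.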

Publication types

Comparative Study
Evaluation Study
Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Chromosome Mapping*
Databases, Nucleic Acid*
Diabetes Mellitus, Type 2 / genetics*
Gene Frequency
Genetic Predisposition to Disease*
Genome, Human / genetics*
Humans
Linkage Disequilibrium
Polymorphism, Single Nucleotide*
Sample Size