A Scale-Corrected Comparison of Linkage Disequilibrium Levels between Genic and Non-Genic Regions

PLoS One. 2015 Oct 30;10(10):e0141216. doi: 10.1371/journal.pone.0141216. eCollection 2015.

Abstract

Understanding the non-random association between loci, termed linkage disequilibrium (LD), plays a central role in genomic research. Since causal mutations are generally not included in genomic marker data, LD between causal mutations and the available markers is essential for capturing the effects of causal loci and thus for localizing genes responsible for traits. The interpretation of association studies therefore requires detailed knowledge of LD patterns. It is well known that most LD measures depend on the minor allele frequencies (MAF) of the considered loci and that the magnitude of LD is influenced by the physical distance between loci. In the present study, a procedure is suggested for comparing the LD structure between genomic regions that each comprise several markers. The approach accounts for different scaling factors, namely the distribution of MAF, the distribution of pair-wise differences in MAF, and the physical extent of the compared regions, reflected by the distribution of pair-wise physical distances. In the first step, genomic regions are matched based on similarity in these scaling factors. In the second step, chromosome- and genome-wide significance tests for differences in the medians of LD measures within each pair are performed. The proposed framework was applied to test the hypothesis that the average LD differs between genic and non-genic regions. This was tested genome-wide with data sets for humans (Homo sapiens), a highly selected chicken line (Gallus gallus domesticus) and the model plant Arabidopsis thaliana. In all three data sets we found a significantly higher level of LD in genic regions than in non-genic regions. Genome-wide, about 31% more LD was detected in genic than in non-genic regions in Arabidopsis thaliana, followed by 13.6% in human and 6% in chicken. Chromosome-wide comparisons revealed significant differences on all 5 chromosomes in Arabidopsis thaliana and on one third of the human and of the chicken chromosomes.
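
The dependence of LD measures on allele frequencies, which motivates the scale correction described above, can be illustrated with a minimal sketch. The abstract does not name the LD measure used, so the squared correlation r^2 between two biallelic loci is assumed here purely for illustration; the function names `r_squared` and `max_r_squared` and the 0/1 haplotype coding are likewise hypothetical and not taken from the study.

```python
from typing import Sequence


def r_squared(hap_a: Sequence[int], hap_b: Sequence[int]) -> float:
    """Squared correlation r^2 between two biallelic loci.

    Each argument holds one 0/1 allele code per phased haplotype
    (an assumed input format, not the study's data layout).
    """
    if len(hap_a) != len(hap_b) or len(hap_a) == 0:
        raise ValueError("haplotype vectors must be non-empty and equally long")
    n = len(hap_a)
    p_a = sum(hap_a) / n                                   # frequency of allele 1 at locus A
    p_b = sum(hap_b) / n                                   # frequency of allele 1 at locus B
    p_ab = sum(a * b for a, b in zip(hap_a, hap_b)) / n    # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                                   # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined for a monomorphic locus")
    return d * d / denom


def max_r_squared(p_a: float, p_b: float) -> float:
    """Largest r^2 attainable for given allele frequencies.

    |D| is bounded by the allele frequencies, so r^2 can reach 1 only
    when the two loci have matching frequencies.
    """
    d_max_pos = min(p_a * (1 - p_b), (1 - p_a) * p_b)      # bound for D > 0
    d_max_neg = min(p_a * p_b, (1 - p_a) * (1 - p_b))      # bound for D < 0
    d_max = max(d_max_pos, d_max_neg)
    return d_max ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))


if __name__ == "__main__":
    print(r_squared([1, 1, 0, 0], [1, 1, 0, 0]))   # perfectly associated loci
    print(r_squared([1, 1, 0, 0], [1, 0, 1, 0]))   # independent loci
    print(max_r_squared(0.1, 0.5))                 # unequal frequencies cap r^2
```

Under these assumptions, two loci with allele frequencies 0.1 and 0.5 can never reach an r^2 above roughly 0.11, whatever their true association. This is why comparing raw LD values between regions with different MAF distributions can mislead, and why matching regions on such scaling factors before testing is a sensible design.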

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Chickens / genetics*
  • Chromosome Mapping
  • Gene Frequency
  • Genome, Human
  • Genome, Plant
  • Genomics / methods*
  • Humans
  • Linkage Disequilibrium*

Associated data

  • dbGaP/PHS000091.V2.P1

Grants and funding

This study was financially supported by RTG 1644 ‘Scaling Problems in Statistics’, financed by the German Research Foundation (DFG). The funder provided support in the form of salaries for author (SB), but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of this author are articulated in the ‘author contributions’ section. Chicken genotypes were generated in the AgroClustEr “Synbreed – Synergistic Plant and Animal Breeding” (Funding ID: 0315528C) funded by the German Federal Ministry of Education and Research.