Characterization of the epigenome promises to reveal the functional elements buried in the human genome sequence, thus helping to annotate non-coding DNA polymorphisms with regulatory functions. Here, we develop two novel strategies that combine epigenomic data with transcriptomic profiles from humans or mice to prioritize candidate SNPs associated with lipid levels in genome-wide association studies (GWAS). First, after confirming that lipid-associated loci that are also expression quantitative trait loci (eQTL) in human livers are enriched for ENCODE regulatory marks in the human hepatocellular HepG2 cell line, we prioritize candidate SNPs based on the number of these marks that overlap the variant position. This method recovered the known SORT1 regulatory SNP rs12740374, which is associated with LDL-cholesterol, and highlighted candidate functional SNPs at 15 additional lipid loci. In the second strategy, we combine ENCODE chromatin immunoprecipitation followed by high-throughput DNA sequencing (ChIP-seq) data with liver expression datasets from knockout mice lacking specific transcription factors. This approach identified SNPs in specific transcription factor binding sites that are located near target genes of these transcription factors. We show that FOXA2 transcription factor binding sites are enriched at lipid-associated loci, and we experimentally validate that the alleles of one such proxy SNP, located near the FOXA2 target gene BIRC5, differ in FOXA2-DNA binding and enhancer activity. These methods can be used to generate testable hypotheses for many non-coding SNPs associated with complex diseases or traits.
Keywords: BIRC5; ENCODE; FOXA1; FOXA2; GWAS; HNF4A.
Copyright © 2014 Elsevier Inc. All rights reserved.
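
The first strategy described in the abstract, ranking candidate SNPs by the number of regulatory marks that overlap the variant position, can be illustrated with a minimal sketch. This is not the authors' pipeline. The SNP identifiers, coordinates, mark names, and the helper functions `marks_at_snps` and `rank_snps` are all hypothetical, and intervals are assumed to be half-open `[start, end)`.

```python
from collections import namedtuple

# Hypothetical toy data: every coordinate, SNP ID and mark name below is
# invented for illustration and does not come from ENCODE or the paper.
Interval = namedtuple("Interval", ["chrom", "start", "end", "mark"])

snps = {
    "snp_a": ("chr1", 150),
    "snp_b": ("chr1", 400),
    "snp_c": ("chr2", 150),
}

intervals = [
    Interval("chr1", 100, 200, "mark_A"),
    Interval("chr1", 120, 180, "mark_B"),
    Interval("chr1", 140, 160, "mark_C"),
    Interval("chr1", 390, 410, "mark_A"),
    Interval("chr2", 300, 500, "mark_B"),
]


def marks_at_snps(snps, intervals):
    """Map each SNP ID to the set of mark names whose interval covers it.

    Intervals are treated as half-open [start, end) (an assumption).
    """
    hits = {snp_id: set() for snp_id in snps}
    for snp_id, (chrom, pos) in snps.items():
        for iv in intervals:
            if iv.chrom == chrom and iv.start <= pos < iv.end:
                hits[snp_id].add(iv.mark)
    return hits


def rank_snps(snps, intervals):
    """Return (snp_id, n_marks) pairs, most overlapping marks first.

    Ties are broken alphabetically by SNP ID so the order is deterministic.
    """
    hits = marks_at_snps(snps, intervals)
    scored = [(snp_id, len(marks)) for snp_id, marks in hits.items()]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


ranking = rank_snps(snps, intervals)
```

The nested loop is quadratic and only suitable for small inputs. A genome-scale analysis would normally use an interval index or a dedicated overlap tool, which this sketch leaves out for brevity.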