Transcription factor profiling reveals molecular choreography and key regulators of human retrotransposon expression

Proc Natl Acad Sci U S A. 2018 Jun 12;115(24):E5526-E5535. doi: 10.1073/pnas.1722565115. Epub 2018 May 25.

Abstract

Transposable elements (TEs) represent a substantial fraction of many eukaryotic genomes, and transcriptional regulation of these elements is important in determining TE activity in human cells. However, due to the repetitive nature of TEs, identifying transcription factor (TF)-binding sites from ChIP-sequencing (ChIP-seq) datasets is challenging. Current algorithms rely on subtle sequence differences between TE copies and thus bias the analysis toward relatively old and inactive TEs. Here we describe an approach termed "MapRRCon" (mapping repeat reads to a consensus), which allows us to identify proteins binding to TE DNA sequences by mapping ChIP-seq reads to the TE consensus sequence after whole-genome alignment. Although this method does not assign binding sites to individual insertions in the genome, it provides a landscape of interacting TFs by capturing factors that bind to TEs under various conditions. We applied this method to screen for TF interactions with L1 in human cells/tissues using ENCODE ChIP-seq datasets and identified 178 of the 512 TFs tested as bound to L1 in at least one biological condition, with most of them (138) localized to the promoter. Among these L1-binding factors, we focused on Myc and CTCF, as they play important roles in cancer progression and 3D chromatin structure formation. Furthermore, we explored the transcriptomes of The Cancer Genome Atlas breast and ovarian tumor samples, in which a consistent correlation or anticorrelation between L1 expression and Myc/CTCF expression was observed, suggesting that these two factors may play roles in regulating L1 transcription during the development of such tumors.
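
The consensus-mapping idea described above can be illustrated with a minimal sketch. This is not the authors' MapRRCon code. It assumes that reads have already been aligned to the TE consensus and reduced to 0-based, half-open (start, end) intervals, and it omits library-size normalization and peak statistics. The function names, the pseudocount, and the toy intervals are all hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # 0-based, half-open (start, end) on the consensus


def coverage(reads: List[Interval], length: int) -> List[int]:
    """Per-base read depth along a consensus sequence of `length` bases."""
    diff = [0] * (length + 1)  # difference array: +1 at start, -1 at end
    for start, end in reads:
        start = max(start, 0)
        end = min(end, length)
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    depth, running = [], 0
    for delta in diff[:length]:
        running += delta
        depth.append(running)
    return depth


def enrichment(chip: List[int], control: List[int],
               pseudocount: float = 1.0) -> List[float]:
    """Per-base ChIP/control ratio; the pseudocount (an assumption) avoids
    division by zero where the control has no coverage."""
    return [(c + pseudocount) / (k + pseudocount)
            for c, k in zip(chip, control)]


def peak_position(signal: List[float]) -> int:
    """Index of the first position holding the maximum signal."""
    return max(range(len(signal)), key=signal.__getitem__)


# Toy data (hypothetical): a 10-base consensus with ChIP reads piled up
# near the 5' end and a single control read further downstream.
chip_reads = [(0, 4), (1, 5), (2, 6)]
control_reads = [(5, 9)]
chip_depth = coverage(chip_reads, 10)
control_depth = coverage(control_reads, 10)
ratio = enrichment(chip_depth, control_depth)
summit = peak_position(ratio)
```

Collapsing all copies onto one consensus coordinate system is what gives up insertion-level resolution in exchange for signal from young, nearly identical copies. A real analysis would parse alignments from BAM files and test enrichment statistically rather than take a simple ratio.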

Keywords: CTCF; ChIP-seq; ENCODE; LINE-1; Myc.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms
  • Breast Neoplasms / genetics
  • Chromatin / genetics
  • Female
  • Gene Expression Regulation / genetics*
  • Genome / genetics
  • Humans
  • Long Interspersed Nucleotide Elements / genetics
  • Ovarian Neoplasms / genetics
  • Promoter Regions, Genetic / genetics
  • Protein Binding / genetics
  • Regulatory Elements, Transcriptional / genetics*
  • Retroelements / genetics*
  • Transcription Factors / genetics*
  • Transcriptome / genetics

Substances

  • Chromatin
  • Retroelements
  • Transcription Factors