Sequence determinants of human gene regulatory elements

Nat Genet. 2022 Mar;54(3):283-294. doi: 10.1038/s41588-021-01009-4. Epub 2022 Feb 21.

Abstract

DNA can determine where and when genes are expressed, but the full set of sequence determinants that control gene expression is unknown. Here, we used massively parallel reporter assays (MPRAs) to measure the transcriptional activity of DNA sequences that represent a sequence space ~100 times larger than the human genome. Machine learning models revealed that transcription factors (TFs) generally act in an additive manner with weak grammar and that most enhancers increase expression from a promoter by a mechanism that does not appear to involve specific TF-TF interactions. The enhancers themselves can be classified into three types: classical, closed chromatin and chromatin dependent. We also show that few TFs are strongly active in a cell, with most activities being similar between cell types. Individual TFs can have multiple gene regulatory activities, including chromatin opening and enhancing, promoting and determining transcription start site (TSS) activity, consistent with the view that the TF binding motif is the key atomic unit of gene expression.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Binding Sites / genetics
  • Genome, Human / genetics
  • Humans
  • Protein Binding
  • Regulatory Sequences, Nucleic Acid* / genetics
  • Transcription Factors* / genetics
  • Transcription Factors* / metabolism

Substances

  • Transcription Factors