Synthetic STARR-seq reveals how DNA shape and sequence modulate transcriptional output and noise.
Stefanie Schöne, Melissa Bothe, Edda Einfeldt, Marina Borschiwer, Philipp Benner, Martin Vingron, Morgane Thomas-Chollier, Sebastiaan H Meijsing
Published in: PLoS Genetics (2018)
The binding of transcription factors to short recognition sequences plays a pivotal role in controlling the expression of genes. The sequence and shape characteristics of binding sites influence DNA binding specificity and have also been implicated in modulating the activity of transcription factors downstream of binding. To quantitatively assess the transcriptional activity of tens of thousands of designed synthetic sites in parallel, we developed a synthetic version of STARR-seq (synSTARR-seq). We used the approach to systematically analyze how variations in the recognition sequence of the glucocorticoid receptor (GR) affect transcriptional regulation. Our approach identified a novel, highly active functional GR binding sequence and revealed that sequence variation both within and flanking GR's core binding site can modulate GR activity without apparent changes in DNA binding affinity. Notably, we found that the sequence composition of variants with similar activity profiles was highly diverse. In contrast, groups of variants with similar activity profiles showed specific DNA shape characteristics, indicating that DNA shape may be a better predictor of activity than DNA sequence. Finally, using single-cell experiments with individual enhancer variants, we obtained clues indicating that the architecture of the response element can independently tune expression mean and cell-to-cell variability in gene expression (noise). Together, our studies establish synSTARR-seq as a powerful method to systematically study how DNA sequence and shape modulate transcriptional output and noise.
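The distinction the abstract draws between expression mean and noise can be illustrated with a minimal sketch. It assumes noise is summarized as the coefficient of variation (standard deviation divided by mean), a common convention that is not necessarily the metric used in the paper. The function name and the per-cell intensity values are hypothetical and chosen only for illustration.

```python
from statistics import mean, pstdev


def expression_mean_and_noise(intensities):
    """Return (mean, noise) for per-cell reporter intensities.

    Noise is taken here to be the coefficient of variation (std / mean).
    This is an assumed definition, not one confirmed by the source.
    """
    if not intensities:
        raise ValueError("need at least one measurement")
    mu = mean(intensities)
    if mu == 0:
        raise ValueError("mean expression is zero; CV is undefined")
    return mu, pstdev(intensities) / mu


# Hypothetical single-cell measurements for two enhancer variants.
# Both have the same mean, but variant_b is spread more widely across cells.
variant_a = [90.0, 100.0, 110.0]
variant_b = [50.0, 100.0, 150.0]

mean_a, noise_a = expression_mean_and_noise(variant_a)
mean_b, noise_b = expression_mean_and_noise(variant_b)
print(f"variant_a: mean={mean_a:.1f}, noise={noise_a:.3f}")
print(f"variant_b: mean={mean_b:.1f}, noise={noise_b:.3f}")
```

In this toy data the two variants share the same mean but differ in noise, which is the kind of independent tuning the abstract describes.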
Keyphrases
- dna binding
- single cell
- transcription factor
- gene expression
- rna seq
- circulating tumor
- single molecule
- genome wide
- cell free
- amino acid
- poor prognosis
- air pollution
- copy number
- stem cells
- dna methylation
- high throughput
- magnetic resonance
- computed tomography
- signaling pathway
- genome wide identification
- bone marrow
- mass spectrometry
- mesenchymal stem cells
- heat shock