Rewriting regulatory DNA to dissect and reprogram gene expression.
Gabriella E Martyn, Michael T Montgomery, Hank Jones, Katherine Guo, Benjamin R Doughty, Johannes Linder, Ziwei Chen, Kelly Cochran, Kathryn A Lawrence, Glen Munson, Anusri Pampari, Charles P Fulco, David R Kelley, Eric S Lander, Anshul Kundaje, Jesse M Engreitz
Published in: bioRxiv : the preprint server for biology (2023)
Regulatory DNA sequences within enhancers and promoters bind transcription factors to encode cell type-specific patterns of gene expression. However, the regulatory effects and programmability of such DNA sequences remain difficult to map or predict because we have lacked scalable methods to precisely edit regulatory DNA and quantify the effects in an endogenous genomic context. Here we present an approach to measure the quantitative effects of hundreds of designed DNA sequence variants on gene expression, by combining pooled CRISPR prime editing with RNA fluorescence in situ hybridization and cell sorting (Variant-FlowFISH). We apply this method to mutagenize and rewrite regulatory DNA sequences in an enhancer and the promoter of PPIF in two immune cell lines. Of 672 variant-cell type pairs, we identify 497 that affect PPIF expression. These variants appear to act through a variety of mechanisms including disruption or optimization of existing transcription factor binding sites, as well as creation of de novo sites. Disrupting a single endogenous transcription factor binding site often led to large changes in expression (up to -40% in the enhancer, and -50% in the promoter). The same variant often had different effects across cell types and states, demonstrating a highly tunable regulatory landscape. We use these data to benchmark performance of sequence-based predictive models of gene regulation, and find that certain types of variants are not accurately predicted by existing models. Finally, we computationally design 185 small sequence variants (≤10 bp) and optimize them for specific effects on expression in silico. 84% of these rationally designed edits showed the intended direction of effect, and some had dramatic effects on expression (-100% to +202%).
Variant-FlowFISH thus provides a powerful tool to map the effects of variants and transcription factor binding sites on gene expression, test and improve computational models of gene regulation, and reprogram regulatory DNA.
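As a reading aid for the percentages quoted above, here is a minimal Python sketch of the arithmetic. It assumes that a percent effect means the change in expression of an edited allele relative to the unedited reference allele; the abstract does not spell out this definition, and the function names `percent_effect` and `fraction_as_percent` are hypothetical, not part of any Variant-FlowFISH software.

```python
def percent_effect(variant_expression: float, reference_expression: float) -> float:
    """Percent change in expression of a variant relative to the reference allele.

    Assumption (not stated in the abstract): effects are reported as
    (variant / reference - 1) * 100, so -100% means no detectable expression
    and +202% means roughly three times the reference level.
    """
    if reference_expression <= 0:
        raise ValueError("reference expression must be positive")
    return (variant_expression / reference_expression - 1.0) * 100.0


def fraction_as_percent(count: int, total: int) -> float:
    """Express count out of total as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


if __name__ == "__main__":
    # Share of variant-cell type pairs reported to affect PPIF expression.
    print(f"{fraction_as_percent(497, 672):.1f}% of pairs had an effect")

    # Illustrative (made-up) expression levels in arbitrary units,
    # with the reference allele set to 1.0.
    for label, level in [("enhancer site disrupted", 0.6), ("strong gain", 3.02)]:
        print(f"{label}: {percent_effect(level, 1.0):+.0f}%")
```

Under this assumed definition, the 497 of 672 variant-cell type pairs reported above correspond to about 74% of all pairs tested.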
Keyphrases
- transcription factor
- gene expression
- circulating tumor
- single molecule
- cell free
- poor prognosis
- DNA methylation
- DNA binding
- copy number
- nucleic acid
- binding protein
- stem cells
- randomized controlled trial
- clinical trial
- long non-coding RNA
- circulating tumor cells
- genetic diversity
- molecular docking
- big data
- energy transfer
- phase iii