Dissecting the cis-regulatory syntax of transcription initiation with deep learning.

Kelly Cochran, Melody Yin, Anika Mantripragada, Jacob Schreiber, Georgi K Marinov, Anshul Kundaje
Published in: bioRxiv : the preprint server for biology (2024)
Despite extensive characterization of mammalian Pol II transcription, the DNA sequence determinants of transcription initiation at a third of human promoters and most enhancers remain poorly understood. Hence, we trained and interpreted a neural network called ProCapNet that accurately models base-resolution initiation profiles from PRO-cap experiments using local DNA sequence. ProCapNet learns sequence motifs with distinct effects on initiation rates and TSS positioning and uncovers context-specific cryptic initiator elements intertwined within other TF motifs. ProCapNet annotates predictive motifs in nearly all actively transcribed regulatory elements across multiple cell lines, revealing a shared cis-regulatory logic across promoters and enhancers mediated by a highly epistatic sequence syntax of cooperative and competitive motif interactions. ProCapNet models of RAMPAGE profiles measuring steady-state RNA abundance at TSSs distill initiation signals on par with models trained directly on PRO-cap profiles. ProCapNet learns a largely cell-type-agnostic cis-regulatory code of initiation complementing sequence drivers of cell-type-specific chromatin state critical for accurate prediction of cell-type-specific transcription initiation.
Keyphrases
  • transcription factor
  • deep learning
  • neural network
  • circulating tumor
  • endothelial cells
  • cell free
  • machine learning
  • dna damage
  • amino acid
  • high resolution
  • anti inflammatory
  • convolutional neural network