Interpreting Cis-Regulatory Interactions from Large-Scale Deep Neural Networks for Genomics.
Shushan Toneyan, Peter K Koo. Published in: bioRxiv: the preprint server for biology (2023)
The rise of large-scale, sequence-based deep neural networks (DNNs) for predicting gene expression has introduced challenges in their evaluation and interpretation. Current evaluations align DNN predictions with experimental perturbation assays, offering a limited perspective of the DNN's capabilities within the studied loci. Moreover, existing model explainability tools mainly focus on motif analysis, which becomes complex to interpret for longer sequences. Here we introduce CREME, an in silico perturbation toolkit that interrogates large-scale DNNs to uncover the rules of gene regulation that they have learned. Using CREME, we investigate Enformer, a prominent DNN in gene expression prediction, revealing cis-regulatory elements (CREs) that directly enhance or silence target genes. We explore the relationship between CRE distance from transcription start sites and gene expression, as well as the complexity of higher-order CRE interactions. This work advances the ability to translate the powerful predictions of large-scale DNNs to study open questions in gene regulation.