Targeted design of synthetic enhancers for selected tissues in the Drosophila embryo.
Bernardo P de Almeida, Christoph Schaub, Michaela Pagani, Stefano Secchia, Eileen E M Furlong, Alexander Stark. Published in: Nature (2023)
Enhancers control gene expression and play crucial roles in development and homeostasis [1-3]. However, the targeted de novo design of enhancers with tissue-specific activities has remained challenging. Here, we combine deep learning and transfer learning to design tissue-specific enhancers for five tissues in the Drosophila melanogaster embryo: the central nervous system (CNS), epidermis, gut, muscle, and brain. We first train convolutional neural networks (CNNs) using genome-wide scATAC-seq datasets and then fine-tune the CNNs with smaller-scale data from in vivo enhancer activity assays, yielding models with 25% to 75% positive predictive value according to cross-validation. We designed and experimentally assessed 40 synthetic enhancers (eight per tissue) in vivo, of which 31 (78%) were active and 27 (68%) functioned in the target tissue (100% for CNS and muscle). The strategy to combine genome-wide and small-scale functional datasets by transfer learning is generally applicable and should enable the design of tissue-, cell type-, and cell state-specific enhancers in any system.
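The success rates quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' code. It uses the standard definition of positive predictive value, PPV = TP / (TP + FP), and the counts reported above (40 designed, 31 active, 27 on target). Two things are assumptions made here rather than statements from the paper: the helper names `positive_predictive_value` and `as_percent`, and the reading of the on-target rate as an experimental analogue of PPV.

```python
from fractions import Fraction


def positive_predictive_value(true_positives: int, false_positives: int) -> Fraction:
    """PPV = TP / (TP + FP): the share of positive predictions that are correct."""
    predicted_positives = true_positives + false_positives
    if predicted_positives == 0:
        raise ValueError("PPV is undefined when nothing is predicted positive")
    return Fraction(true_positives, predicted_positives)


def as_percent(value: Fraction) -> int:
    """Round a fraction to the nearest whole percent (halves round up)."""
    scaled = value * 100
    # floor(x + 1/2) computed exactly on integers to avoid float rounding.
    return (2 * scaled.numerator + scaled.denominator) // (2 * scaled.denominator)


# Counts taken from the abstract.
DESIGNED = 40   # synthetic enhancers tested in vivo (eight per tissue)
ACTIVE = 31     # showed any enhancer activity
ON_TARGET = 27  # active in the intended tissue

active_rate = Fraction(ACTIVE, DESIGNED)
on_target_rate = Fraction(ON_TARGET, DESIGNED)

# Assumption: each design counts as one positive prediction for its target
# tissue, so designs that miss the target are treated as false positives.
design_ppv = positive_predictive_value(ON_TARGET, DESIGNED - ON_TARGET)

print(f"active: {as_percent(active_rate)}%")        # 31/40
print(f"on target: {as_percent(on_target_rate)}%")  # 27/40
```

Under that reading, the 68% on-target rate falls within the 25% to 75% cross-validated PPV range that the abstract reports for the models.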
Keyphrases
- genome wide
- gene expression
- deep learning
- convolutional neural network
- DNA methylation
- Drosophila melanogaster
- single cell
- RNA-seq
- skeletal muscle
- high throughput