A generalizable framework to comprehensively predict epigenome, chromatin organization, and transcriptome.
Zhenhao Zhang, Fan Feng, Yiyang Qiu, Jie Liu
Published in: Nucleic Acids Research (2023)
Many deep learning approaches have been proposed to predict epigenetic profiles, chromatin organization, and transcription activity. While these approaches achieve satisfactory performance in predicting one modality from another, the learned representations do not generalize across predictive tasks or across cell types. In this paper, we propose EPCOT, a deep learning approach that employs a pre-training and fine-tuning framework. EPCOT accurately and comprehensively predicts multiple modalities, including epigenome, chromatin organization, transcriptome, and enhancer activity, for new cell types, requiring only cell-type specific chromatin accessibility profiles. Many of these predicted modalities, such as Micro-C and ChIA-PET, are expensive to obtain experimentally, so in silico predictions from EPCOT should be helpful in practice. Furthermore, the pre-training and fine-tuning framework allows EPCOT to learn generic representations that generalize across different predictive tasks. Interpreting EPCOT models also provides biological insights, including mapping between different genomic modalities, identifying TF sequence binding patterns, and analyzing cell-type specific TF impacts on enhancer activity.
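The pre-training and fine-tuning idea described above can be illustrated with a toy sketch. This is not EPCOT's architecture or code: every name here (`SharedEncoder`, `TaskHead`, `fine_tune`) is hypothetical, the encoder is a fixed random linear map standing in for a pre-trained network, and keeping the encoder frozen while training only small task heads is a simplifying assumption. It only shows how one shared representation might be reused by several downstream predictors.

```python
import random


class SharedEncoder:
    """Hypothetical stand-in for a pre-trained encoder (fixed random weights)."""

    def __init__(self, in_dim, hidden_dim, seed=0):
        rng = random.Random(seed)
        self.weights = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
                        for _ in range(hidden_dim)]

    def encode(self, x):
        # Linear map followed by ReLU; the weights are never updated here.
        return [max(0.0, sum(w * v for w, v in zip(row, x)))
                for row in self.weights]


class TaskHead:
    """Hypothetical task-specific linear head trained during fine-tuning."""

    def __init__(self, hidden_dim, out_dim):
        self.weights = [[0.0] * hidden_dim for _ in range(out_dim)]

    def predict(self, h):
        return [sum(w * v for w, v in zip(row, h)) for row in self.weights]

    def sgd_step(self, h, target, lr):
        # One gradient step on squared error; returns the loss before the update.
        errors = [p - t for p, t in zip(self.predict(h), target)]
        for row, err in zip(self.weights, errors):
            for j, v in enumerate(h):
                row[j] -= lr * 2.0 * err * v
        return sum(e * e for e in errors)


def fine_tune(encoder, head, examples, epochs=50, lr=0.01):
    """Train only the head on (input, target) pairs; return mean loss per epoch."""
    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, y in examples:
            total += head.sgd_step(encoder.encode(x), y, lr)
        losses.append(total / len(examples))
    return losses


# Toy data (made up): 5 input features per example, e.g. an accessibility-like signal.
rng = random.Random(1)
inputs = [[rng.random() for _ in range(5)] for _ in range(20)]

encoder = SharedEncoder(in_dim=5, hidden_dim=8)
encoder_before = [row[:] for row in encoder.weights]

# Two hypothetical downstream tasks that reuse the same encoder.
expression_head = TaskHead(hidden_dim=8, out_dim=1)
contact_head = TaskHead(hidden_dim=8, out_dim=3)

expression_data = [(x, [sum(x)]) for x in inputs]
contact_data = [(x, [x[0], x[1] + x[2], x[3] * x[4]]) for x in inputs]

expression_losses = fine_tune(encoder, expression_head, expression_data)
contact_losses = fine_tune(encoder, contact_head, contact_data)
```

In this sketch the two heads differ only in output size and training targets, while `encoder.encode` is shared; a real model would use a learned encoder and task-appropriate decoders and losses.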
Keyphrases
- transcription factor
- gene expression
- genome wide
- dna methylation
- deep learning
- single cell
- working memory
- dna damage
- rna seq
- air pollution
- cell therapy
- copy number
- healthcare
- convolutional neural network
- machine learning
- high resolution
- oxidative stress
- artificial intelligence
- molecular docking
- virtual reality
- stem cells
- mesenchymal stem cells
- primary care
- pet ct
- pet imaging