DeepICSH: a complex deep learning framework for identifying cell-specific silencers and their strength from the human genome.
Tianjiao Zhang, Liangyu Li, Hailong Sun, Dali Xu, Guohua Wang
Published in: Briefings in Bioinformatics (2023)
Silencers are noncoding DNA sequence fragments located on the genome that suppress gene expression. The variation of silencers in specific cells is closely related to gene expression and cancer development. Computational approaches that exclusively rely on DNA sequence information for silencer identification fail to account for the cell specificity of silencers, resulting in diminished accuracy. Despite the discovery of several transcription factors and epigenetic modifications associated with silencers on the genome, there is still no definitive biological signal or combination thereof to fully characterize silencers, posing challenges in selecting suitable biological signals for their identification. Therefore, we propose a sophisticated deep learning framework called DeepICSH, which is based on multiple biological data sources. Specifically, DeepICSH leverages a deep convolutional neural network to automatically capture biologically relevant signal combinations strongly associated with silencers, originating from a diverse array of biological signals. Furthermore, the utilization of attention mechanisms facilitates the scoring and visualization of these signal combinations, whereas the employment of skip connections facilitates the fusion of multilevel sequence features and signal combinations, thereby empowering the accurate identification of silencers within specific cells. Extensive experiments on HepG2 and K562 cell line data sets demonstrate that DeepICSH outperforms state-of-the-art methods in silencer identification. Notably, we introduce for the first time a deep learning framework based on multi-omics data for classifying strong and weak silencers, achieving favorable performance. In conclusion, DeepICSH shows great promise for advancing the study and analysis of silencers in complex diseases. The source code is available at https://github.com/lyli1013/DeepICSH.
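The three ingredients named in the abstract (convolution over multiple biological signals, attention scoring of the resulting signal combinations, and a skip connection that fuses them with sequence features) can be illustrated with a toy forward pass. This is only a minimal sketch under stated assumptions, not the DeepICSH implementation: the function names, layer sizes, random weights, the use of global max pooling, and the choice of concatenation as the skip connection are all hypothetical and are not taken from the paper or its repository.

```python
import math
import random


def conv_max_pool(tracks, kernel):
    """Slide one multi-track kernel along the region, apply ReLU, then global max pool."""
    width = len(kernel[0])
    length = len(tracks[0])
    best = 0.0  # starting at 0.0 makes the running max equal to ReLU followed by max pooling
    for start in range(length - width + 1):
        act = sum(
            kernel[t][j] * tracks[t][start + j]
            for t in range(len(tracks))
            for j in range(width)
        )
        best = max(best, act)
    return best


def softmax(scores):
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_silencer(seq_features, tracks, kernels, attn_w, out_w, bias=0.0):
    """Toy forward pass returning (probability, attention weights)."""
    # 1. Each kernel spans all signal tracks, so each output is one "signal combination".
    combos = [conv_max_pool(tracks, k) for k in kernels]
    # 2. Attention scores each combination; the weights can be inspected or plotted.
    attention = softmax([w * c for w, c in zip(attn_w, combos)])
    attended = [a * c for a, c in zip(attention, combos)]
    # 3. Skip connection (assumed here to be concatenation) fuses the
    #    sequence features with the attended signal combinations.
    fused = list(seq_features) + attended
    logit = sum(w * x for w, x in zip(out_w, fused)) + bias
    return 1.0 / (1.0 + math.exp(-logit)), attention


def random_example(n_tracks=4, length=20, n_kernels=3, width=5, n_seq=6, seed=0):
    """Build hypothetical inputs and untrained random weights for the sketch."""
    rng = random.Random(seed)
    tracks = [[rng.random() for _ in range(length)] for _ in range(n_tracks)]
    kernels = [
        [[rng.uniform(-1, 1) for _ in range(width)] for _ in range(n_tracks)]
        for _ in range(n_kernels)
    ]
    seq_features = [rng.random() for _ in range(n_seq)]
    attn_w = [rng.uniform(-1, 1) for _ in range(n_kernels)]
    out_w = [rng.uniform(-1, 1) for _ in range(n_seq + n_kernels)]
    return seq_features, tracks, kernels, attn_w, out_w


if __name__ == "__main__":
    seq, tracks, kernels, attn_w, out_w = random_example()
    prob, attention = predict_silencer(seq, tracks, kernels, attn_w, out_w)
    print("silencer probability:", prob)
    print("attention over signal combinations:", attention)
```

Because the weights are random, the printed probability carries no biological meaning. The sketch only shows how the attention weights stay inspectable per signal combination while the sequence features bypass the attention step and reach the output layer directly.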
Keyphrases
- deep learning
- gene expression
- convolutional neural network
- DNA methylation
- induced apoptosis
- big data
- single cell
- electronic health record
- transcription factor
- endothelial cells
- machine learning
- genome wide
- high throughput
- bioinformatics analysis
- healthcare
- small molecule
- cell free
- oxidative stress
- data analysis
- radiation therapy
- mesenchymal stem cells
- mental illness
- health information
- papillary thyroid
- social media
- squamous cell