A Grid Search-Based Multilayer Dynamic Ensemble System to Identify DNA N4-Methylcytosine Using Deep Learning Approach.

Rajib Kumar Halder, Mohammed Nasir Uddin, Md Ashraf Uddin, Sunil Aryal, Md Aminul Islam, Fahima Hossain, Nusrat Jahan, Ansam Khraisat, Ammar Alazab
Published in: Genes (2023)
DNA (deoxyribonucleic acid) N4-methylcytosine (4mC), a kind of epigenetic modification of DNA, is important for modifying gene functions, such as protein interactions, conformation, and stability in DNA, as well as for the control of gene expression throughout cell development and genomic imprinting. It also plays a crucial role in the restriction-modification system. To further understand the function and regulatory mechanism of 4mC, it is essential to locate 4mC sites precisely and determine their chromosomal distribution. This research aims to design an efficient, high-throughput, discriminative computational system that uses the natural language processing method "word2vec" and a multi-configured 1D convolutional neural network (1D CNN) to predict 4mC sites. In this article, we propose a grid search-based multi-layer dynamic ensemble system (GS-MLDS) that can enhance the existing knowledge at each level. Each layer uses a grid search-based weight-searching approach to find the weights that give the best accuracy while minimizing computation time and the number of additional layers. We used eight publicly available benchmark datasets collected from different sources to test the proposed model's efficiency. The test accuracies obtained were 0.978, 0.954, 0.944, 0.961, 0.950, 0.973, 0.948, 0.952, 0.961, and 0.980. The proposed model was also compared with 16 distinct models, and the comparison indicates that it can accurately predict 4mC sites.
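The abstract does not spell out how the grid search over ensemble weights is carried out, so the following is only a minimal sketch of the general idea, not the authors' GS-MLDS procedure. It assumes each base model (for example, one 1D CNN configuration) outputs a probability that a sequence contains a 4mC site, and it searches a coarse grid of non-negative weights summing to 1 for the blend with the highest validation accuracy. The function name `grid_search_weights`, the `step` parameter, and the 0.5 decision threshold are illustrative assumptions.

```python
import itertools

import numpy as np


def grid_search_weights(probs, labels, step=0.1):
    """Search a weight grid for the best-scoring blend of base models.

    probs  : array of shape (n_models, n_samples), each row one model's
             predicted probability of a 4mC site (assumed input format).
    labels : array of shape (n_samples,) with values 0 or 1.
    step   : grid resolution; weights are multiples of `step` summing to 1.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    n_models = probs.shape[0]
    ticks = int(round(1.0 / step))

    best_weights, best_acc = None, -1.0
    # Enumerate integer tuples that sum to `ticks`; dividing by `ticks`
    # gives weight vectors on a regular grid over the probability simplex.
    for combo in itertools.product(range(ticks + 1), repeat=n_models):
        if sum(combo) != ticks:
            continue
        weights = np.array(combo, dtype=float) / ticks
        blended = weights @ probs                  # weighted average per sample
        preds = (blended >= 0.5).astype(int)       # assumed 0.5 threshold
        acc = float(np.mean(preds == labels))
        if acc > best_acc:                         # keep the first best found
            best_weights, best_acc = weights, acc
    return best_weights, best_acc


if __name__ == "__main__":
    # Toy validation data: model A is reliable, model B is not.
    labels = np.array([1, 0, 1, 0])
    probs = np.array([
        [0.9, 0.1, 0.8, 0.2],   # model A
        [0.4, 0.6, 0.3, 0.7],   # model B
    ])
    weights, acc = grid_search_weights(probs, labels)
    print("weights:", weights, "accuracy:", acc)
```

The exhaustive enumeration grows quickly with the number of models and the grid resolution, which is one reason a layered scheme that combines a few models at a time may be attractive; whether the paper's layers work this way is not stated in the abstract.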
Keyphrases