m5C-Seq: Machine learning-enhanced profiling of RNA 5-methylcytosine modifications.

Zeeshan Abbas, Mobeen Ur Rehman, Hilal Tayara, Seung Won Lee, Kil To Chong
Published in: Computers in Biology and Medicine (2024)
Epigenetic modifications, particularly RNA methylation and histone alterations, play a crucial role in heredity, development, and disease. Among these, RNA 5-methylcytosine (m5C) is the most prevalent RNA modification in mammalian cells, essential for processes such as ribosome synthesis, translational fidelity, mRNA nuclear export, turnover, and translation. The increasing volume of nucleotide sequences has led to the development of machine learning-based predictors of m5C sites. However, these predictors often face challenges related to training data limitations and overfitting due to insufficient external validation. This study introduces m5C-Seq, an ensemble learning approach for RNA modification profiling, designed to address these issues. m5C-Seq employs a meta-classifier that integrates 15 probabilities, generated from a novel, large dataset using systematic encoding methods, to make final predictions. Demonstrating superior performance compared to existing predictors, m5C-Seq represents a significant advancement in accurate RNA modification profiling. The code and the newly established datasets are available on GitHub at https://github.com/Z-Abbas/m5C-Seq.
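The meta-classifier idea in the abstract can be illustrated with a minimal stacking sketch. This is not the authors' implementation: the abstract does not say which base models or encodings produce the 15 probabilities, nor which meta-classifier is used. The sketch assumes a logistic-regression meta-classifier and uses synthetic stand-ins for the base-model outputs; all names here (`fit_meta`, `predict_meta`, `synthetic_base_outputs`) are hypothetical.

```python
import math
import random

N_BASE = 15  # number of base-model probabilities per sample (from the abstract)


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_meta(P, y, lr=0.2, epochs=1000):
    """Fit a logistic-regression meta-classifier by batch gradient descent.

    P: list of rows, each holding N_BASE base-model probabilities.
    y: list of 0/1 labels (1 = m5C site, 0 = non-site).
    The choice of logistic regression is an assumption for illustration.
    """
    n, d = len(P), len(P[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for row, label in zip(P, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b) - label
            for j in range(d):
                gw[j] += err * row[j]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def predict_meta(w, b, row):
    # Final probability that the candidate site is methylated.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, row)) + b)


def synthetic_base_outputs(n_per_class=20, seed=0):
    # Synthetic stand-in for real base-model outputs: positives score
    # high, negatives score low. Real outputs would come from trained models.
    rng = random.Random(seed)
    P, y = [], []
    for label, (lo, hi) in ((1, (0.6, 0.9)), (0, (0.1, 0.4))):
        for _ in range(n_per_class):
            P.append([rng.uniform(lo, hi) for _ in range(N_BASE)])
            y.append(label)
    return P, y


P, y = synthetic_base_outputs()
w, b = fit_meta(P, y)
```

Calling `predict_meta(w, b, row)` on a new row of 15 base-model probabilities returns a single combined probability. In a real stacking setup, the rows used to fit the meta-classifier would normally be out-of-fold predictions, so that the meta-classifier does not learn from probabilities the base models produced on their own training data.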
Keyphrases
  • single cell
  • rna seq
  • genome wide
  • machine learning
  • dna methylation
  • nucleic acid
  • gene expression
  • big data
  • mass spectrometry
  • body composition
  • high throughput sequencing