Identification of microRNA precursors using reduced and hybrid features.
Asad Khan, Sajid Shah, Fazli Wahid, Fiaz Gul Khan, Saima Jabeen
Published in: Molecular BioSystems (2018)
MicroRNAs (miRNAs) are a group of short non-coding RNA molecules that play a vital role in regulating gene expression at the transcriptional and post-transcriptional levels. Abnormal miRNA expression has been observed in cancer, heart disease and nervous system disorders. For both basic research and microRNA-based therapy, it is therefore imperative to separate real pre-miRNAs from false ones (hairpin sequences similar to pre-miRNA stem loops). Both conservation-based and machine learning methods have been applied to the identification of miRNAs, but machine learning algorithms have gained more popularity than conservation-based algorithms because of their sensitivity and overall performance. Given the avalanche of RNA sequences discovered in the post-genomic age, a computational predictor for identifying human pre-microRNAs is needed. We have developed a predictor called MicroR-Pred, in which RNA sequences are formulated as a hybrid feature vector. The novelty of the predictor lies in using the partial least squares technique for dimension reduction, followed by the Random Forest (RF) and Support Vector Machine (SVM) algorithms for classification. The performance of MicroR-Pred is quite promising compared with other state-of-the-art miRNA predictors: it achieved accuracies of 88.40% and 93.90% with RF and SVM, respectively.
Keyphrases
- machine learning
- gene expression
- deep learning
- artificial intelligence
- big data
- bioinformatics analysis
- heart failure
- poor prognosis
- transcription factor
- dna methylation
- climate change
- papillary thyroid
- heat shock
- mesenchymal stem cells
- genetic diversity
- bone marrow
- copy number
- young adults
- lymph node metastasis
- heat shock protein