GPSuc: Global Prediction of Generic and Species-specific Succinylation Sites by aggregating multiple sequence features.

Md Mehedi Hasan, Hiroyuki Kurata
Published in: PLoS ONE (2018)
Lysine succinylation is one of the dominant post-translational modifications of proteins and contributes to many biological processes, including the cell cycle, growth, and signal transduction pathways. Identification of succinylation sites is an important step toward understanding protein function. The complicated sequence patterns of protein succinylation revealed by proteomic studies highlight the need for effective species-specific in silico strategies for the global prediction of succinylation sites. Here we developed generic and nine species-specific succinylation site classifiers by aggregating multiple complementary features. We optimized the consecutive features using the Wilcoxon-rank feature selection scheme. The final feature vectors were used to train a random forest (RF) classifier. By integrating the RF scores via logistic regression, the resulting predictor, termed GPSuc, achieved better performance than other existing generic and species-specific succinylation site predictors. Our predictor serves as a valuable resource for revealing the mechanism of succinylation and assisting hypothesis-driven experimental design. To make the predictor applicable to large-scale datasets, a web application was developed at http://kurata14.bio.kyutech.ac.jp/GPSuc/.
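The abstract names a Wilcoxon-rank feature selection scheme but does not spell it out. As a rough illustration only, the sketch below ranks feature columns by the absolute z statistic of a Wilcoxon rank-sum test between succinylated (label 1) and non-succinylated (label 0) samples and keeps the top k. The function names, the normal approximation without tie correction, the top-k cutoff, and the toy data are all assumptions made for this example, not the authors' implementation.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_z(pos, neg):
    """Wilcoxon rank-sum z statistic (normal approximation, no tie correction)."""
    n1, n2 = len(pos), len(neg)
    ranks = average_ranks(list(pos) + list(neg))
    w = sum(ranks[:n1])                       # rank sum of the positive group
    mean = n1 * (n1 + n2 + 1) / 2             # expected rank sum under H0
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # standard deviation under H0
    return (w - mean) / sd


def select_features(X, y, top_k):
    """Return indices of the top_k features by |z|, most discriminative first."""
    scores = []
    for j in range(len(X[0])):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        scores.append((abs(rank_sum_z(pos, neg)), j))
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:top_k]]


# Hypothetical toy data: rows are candidate lysine sites, columns are features.
# Feature 0 separates the classes, feature 1 is noise, feature 2 is in between.
X = [
    [0.9, 0.2, 0.6],
    [0.8, 0.7, 0.5],
    [0.7, 0.4, 0.7],
    [0.2, 0.3, 0.4],
    [0.1, 0.6, 0.6],
    [0.3, 0.5, 0.3],
]
y = [1, 1, 1, 0, 0, 0]

selected = select_features(X, y, top_k=2)
print(selected)  # [0, 2]
```

In a pipeline like the one described, the selected columns would then be passed to the random forest, and the per-feature-group RF scores combined by logistic regression; those later steps are not shown here.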
Keyphrases
  • cell cycle
  • machine learning
  • amino acid
  • climate change
  • deep learning
  • protein protein
  • small molecule
  • dna methylation
  • genome wide
  • single cell
  • rna seq
  • body composition