Structure-sequence features based prediction of phosphosites of serine/threonine protein kinases of Mycobacterium tuberculosis.

Vipul V Nilkanth, Shekhar C Mande
Published in: Proteins (2021)
Elucidating signaling events in a pathogen is potentially important for tackling the infection it causes. Events mediated by protein phosphorylation play important roles in infection. To predict the phosphosites and substrates of the serine/threonine protein kinases (STPKs) of Mycobacterium tuberculosis, we have developed a machine learning-based approach that uses kinase-peptide structure-sequence data. This approach utilizes features derived from the kinase three-dimensional structural environment and known phosphosite sequences to generate support vector machine (SVM)-based, kinase-specific predictions of phosphosites for STPKs with scarce or no substrate data. Of the four machine learning algorithms we tried (random forest, logistic regression, SVM, and k-nearest neighbors), SVM performed best, with an area under the receiver operating characteristic curve (AUC-ROC) of 0.88 on the independent testing dataset and a 10-fold cross-validation accuracy of ~81.6% for the final model. Our predicted phosphosites of M. tuberculosis STPKs form a useful resource for experimental biologists, enabling elucidation of STPK-mediated posttranslational regulation of important cellular processes.
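The two evaluation measures quoted in the abstract, AUC-ROC and 10-fold cross-validation, can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names `roc_auc` and `k_fold_indices` and the toy labels and scores are hypothetical and chosen only for illustration, and no features or classifier from the paper are reproduced.

```python
from typing import Iterator, List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the fraction of (positive, negative) pairs in which the
    positive example gets the higher score; tied scores count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n: int, k: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k-fold cross-validation.
    Every sample index appears in exactly one test fold."""
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and n")
    folds = [list(range(i, n, k)) for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


# Hypothetical toy data: 1 = phosphosite, 0 = non-phosphosite.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(round(roc_auc(toy_labels, toy_scores), 3))  # 8 of 9 pairs ranked correctly -> 0.889

for train_idx, test_idx in k_fold_indices(20, k=10):
    # A real run would fit a classifier on train_idx and score it on test_idx.
    assert len(test_idx) == 2 and len(train_idx) == 18
```

An AUC-ROC of 0.5 corresponds to random ranking and 1.0 to perfect separation of positives from negatives. A cross-validation accuracy is the mean of the per-fold test accuracies obtained from splits like these.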