
ASRpro: A machine-learning computational model for identifying proteins associated with multiple abiotic stress in plants.

Prabina Kumar Meher, Tanmaya Kumar Sahu, Ajit Gupta, Anuj Kumar, Sachin Rustgi
Published in: The plant genome (2022)
A major thrust of plant breeding research is the development of crop cultivars with enhanced tolerance to abiotic stresses. Identifying abiotic stress-responsive genes (SRGs) and proteins is therefore important for this research. However, identifying such genes via established genetic approaches is laborious and resource intensive. Although transcriptome profiling remains a reliable method of SRG identification, it is species specific. Additionally, identifying multistress-responsive genes using gene expression studies is cumbersome. These limitations underscore the need for a computational method for identifying the genes associated with different abiotic stresses. In this work, we aimed to develop a computational model for identifying genes responsive to six abiotic stresses: cold, drought, heat, light, oxidative, and salt. Predictions were made using four algorithms, support vector machine (SVM), random forest, adaptive boosting (ADB), and extreme gradient boosting (XGB), with autocross covariance (ACC) and K-mer compositional features as input. Using the SVM algorithm with fivefold cross-validation, overall accuracies of ∼60-77%, ∼75-86%, and ∼61-78% were obtained with the ACC, K-mer, and ACC + K-mer compositional features, respectively. The SVM also achieved higher accuracy than the other three algorithms. The proposed model was also assessed with an independent dataset, on which its accuracy was consistent with the cross-validation results. The proposed model is the first of its kind and is expected to meet the needs of experimental biologists, although its prediction accuracy was modest. Given its importance for the research community, the online prediction application, ASRpro, is made freely available (https://iasri-sg.icar.gov.in/asrpro/) for predicting abiotic SRGs and proteins.
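The two feature types named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the K-mer size, the sequence alphabet, the physicochemical properties, or the maximum lag used, so `k`, `alphabet`, `properties`, and `max_lag` below are all assumptions, and the function names are hypothetical.

```python
from itertools import product

# Assumption: protein input over the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_composition(seq, k=2, alphabet=AMINO_ACIDS):
    """Relative frequency of every possible k-mer, in a fixed order.

    The vector has len(alphabet) ** k entries regardless of sequence length.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing non-standard symbols
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def acc_features(seq, properties, max_lag=2):
    """Auto- and cross-covariance of per-residue property values.

    `properties` is a list of dicts mapping residue -> numeric value
    (hypothetical input; the abstract does not name the properties used).
    For each lag and each ordered pair of properties (a, b) this returns
    the mean of a_centered[i] * b_centered[i + lag] over the sequence.
    """
    seq = seq.upper()
    n = len(seq)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    centered = []
    for prop in properties:
        track = [prop[r] for r in seq]
        mean = sum(track) / n
        centered.append([v - mean for v in track])
    feats = []
    for lag in range(1, max_lag + 1):
        for a in centered:
            for b in centered:
                cov = sum(a[i] * b[i + lag] for i in range(n - lag)) / (n - lag)
                feats.append(cov)
    return feats
```

Both functions return fixed-length vectors, so sequences of different lengths become comparable inputs for a classifier. Concatenating the two vectors would give a combined representation analogous to the ACC + K-mer feature set described above.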