EMNGly: predicting N-linked glycosylation sites using the language models for feature extraction.
Xiaoyang Hou, Yu Wang, Dongbo Bu, Yaojun Wang, Chuncui Huang
Published in: Bioinformatics (Oxford, England) (2023)
EMNGly is a proposed approach for predicting N-linked glycosylation sites. It uses a pretrained protein language model (Evolutionary Scale Modeling) and a pretrained protein structure model (Inverse Folding Model) for feature extraction, and a support vector machine for classification. In ten-fold cross-validation and independent tests, the approach outperformed existing techniques. On a benchmark independent test set, it achieves a Matthews correlation coefficient, sensitivity, specificity, and accuracy of 0.8282, 0.9343, 0.8934, and 0.9143, respectively.
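For readers less familiar with the four reported metrics, the following minimal sketch shows how each is conventionally derived from the confusion matrix of a binary classifier. The function name `binary_metrics` and the counts in the usage example are hypothetical and chosen only for illustration; they are not taken from the EMNGly paper or its test set.

```python
import math


def binary_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute standard binary-classification metrics from confusion counts.

    tp/fn: positive sites (e.g. glycosylated) predicted correctly/incorrectly.
    tn/fp: negative sites predicted correctly/incorrectly.
    """
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)

    # Matthews correlation coefficient; defined as 0 when any marginal is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "mcc": mcc,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
    }


# Hypothetical counts for illustration only (not the paper's data).
metrics = binary_metrics(tp=90, fn=10, tn=80, fp=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Unlike accuracy, the Matthews correlation coefficient uses all four cells of the confusion matrix, which is why it is commonly reported alongside sensitivity and specificity when the two classes may be imbalanced.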