Toward Automatic Inference of Glycan Linkages Using MSⁿ and Machine Learning: Proof of Concept Using Sialic Acid Linkages.

Xinyi Ni, Nathan B Murray, Stephanie Archer-Hartmann, Lauren E Pepi, Richard F Helm, Parastoo Azadi, Pengyu Hong
Published in: Journal of the American Society for Mass Spectrometry (2023)
Glycosidic linkages in oligosaccharides play essential roles in determining their chemical properties and biological activities. MSⁿ has been widely used to infer glycosidic linkages but requires a substantial amount of starting material, which limits its application. In addition, there is a lack of rigorous research on which MSⁿ protocols are appropriate for characterizing glycosidic linkages. In this work, to deliver high-quality experimental data and analysis results, we propose a machine learning-based framework to establish appropriate MSⁿ protocols and build effective data analysis methods. We demonstrate proof of principle by applying our approach to elucidate sialic acid linkages (α2'-3' and α2'-6') in a set of sialyllactose standards and NIST sialic acid-containing N-glycans, and we identify several protocol configurations that produce high-quality experimental data. Our companion data analysis method achieves nearly 100% accuracy in classifying α2'-3' vs α2'-6' using MS⁵, MS⁴, MS³, or even MS² spectra alone. The ability to determine glycosidic linkages using MS² or MS³ is significant because it requires substantially less sample, which enables linkage analysis for quantity-limited natural glycans and synthesized materials and shortens the overall experimental time. MS² is also more amenable than MS³, MS⁴, or MS⁵ to automation when coupled to direct infusion or LC-MS. Additionally, our method can predict the ratio of α2'-3' and α2'-6' in a mixture with 8.6% RMSE (root-mean-square error) across data sets using MS⁵ spectra. We anticipate that our framework will be generally applicable to analysis of other glycosidic linkages.
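For readers unfamiliar with the RMSE figure quoted for mixture-ratio prediction, the following is a minimal sketch of how such an error could be computed. It assumes that each mixture is described by the fraction of the α2'-3' isomer (between 0 and 1) and that the error is reported in percentage points; the abstract does not state either convention. The function name `rmse_percent` and the example values are hypothetical and are not data from the paper.

```python
import math


def rmse_percent(true_fractions, predicted_fractions):
    """Root-mean-square error between true and predicted fractions,
    expressed in percentage points (assumed reporting convention)."""
    if not true_fractions or len(true_fractions) != len(predicted_fractions):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [
        (predicted - true) ** 2
        for true, predicted in zip(true_fractions, predicted_fractions)
    ]
    return 100.0 * math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical example: fraction of the alpha2'-3' isomer in five mixtures.
# These numbers are illustrative only and do not come from the paper.
true_fraction = [0.0, 0.25, 0.5, 0.75, 1.0]
predicted_fraction = [0.1, 0.25, 0.5, 0.75, 0.9]

print(f"RMSE: {rmse_percent(true_fraction, predicted_fraction):.2f} percentage points")
```

Under this convention, a lower value means the predicted isomer fractions lie closer to the known mixture compositions, and a perfect predictor yields 0.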
Keyphrases