LectinOracle: A Generalizable Deep Learning Model for Lectin-Glycan Binding Prediction.
Jon Lundstrøm, Emma Korhonen, Frédérique Lisacek, Daniel Bojar
Published in: Advanced Science (Weinheim, Baden-Wurttemberg, Germany) (2021)
From bacterial cell adhesion through viral cell entry to human innate immunity, glycan-binding proteins, or lectins, abound in nature. Widely used as staining and characterization reagents in cell biology and crucial for understanding the interactions in biological systems, lectins are a focal point of study in glycobiology. Yet the sheer breadth and depth of their specificity for diverse oligosaccharide motifs has made the study of lectins largely piecemeal, with few options to generalize. Here, LectinOracle, a model combining transformer-based representations for proteins and graph convolutional neural networks for glycans to predict their interaction, is presented. Using a curated data set of 564,647 unique protein-glycan interactions, it is shown that LectinOracle predictions agree with literature-annotated specificities for a wide range of lectins. Using a range of specialized glycan arrays, it is shown that LectinOracle predictions generalize to new glycans and lectins, with qualitative and quantitative agreement with experimental data. It is further demonstrated that LectinOracle can be used to improve lectin classification, accelerate lectin directed evolution, predict epidemiological outcomes in the context of influenza virus, and analyze whole lectomes in host-microbe interactions. It is envisioned that the platform presented here will advance the study of both lectins and their role in (glyco)biology.
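The abstract describes a two-branch design: a protein representation and a graph-based glycan representation are combined to predict an interaction. The following is a minimal sketch of that general idea in plain Python, not the LectinOracle implementation. Every name, feature vector, and weight is hypothetical; the protein embedding is a hard-coded stand-in for a transformer-derived representation, and the graph convolution is an unweighted mean over neighbours rather than a trained layer.

```python
# Illustrative sketch only: NOT the LectinOracle implementation.
# All names, sizes, and weights here are hypothetical.

def graph_conv(node_feats, edges):
    """One mean-aggregation graph convolution step (no learned weights).

    Each node's new feature vector is the mean of its own vector and
    those of its neighbours in the undirected glycan graph.
    """
    n = len(node_feats)
    neighbours = {i: [i] for i in range(n)}  # start with a self-loop
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)
    out = []
    for i in range(n):
        group = [node_feats[j] for j in neighbours[i]]
        dim = len(node_feats[i])
        out.append([sum(v[d] for v in group) / len(group) for d in range(dim)])
    return out


def mean_pool(node_feats):
    """Collapse node features into one fixed-size glycan embedding."""
    n = len(node_feats)
    return [sum(v[d] for v in node_feats) / n for d in range(len(node_feats[0]))]


def predict_binding(protein_emb, node_feats, edges, weights, bias=0.0):
    """Concatenate protein and glycan embeddings, then apply a linear head."""
    glycan_emb = mean_pool(graph_conv(node_feats, edges))
    joint = list(protein_emb) + glycan_emb
    if len(joint) != len(weights):
        raise ValueError("weights must match the joint embedding length")
    return sum(w * x for w, x in zip(weights, joint)) + bias


# Toy glycan with three monosaccharide nodes in a chain: 0 - 1 - 2.
# Feature vectors are made-up one-hot codes, not real glycan encodings.
glycan_nodes = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
glycan_edges = [(0, 1), (1, 2)]

# Stand-in for a transformer-derived protein representation.
protein_embedding = [0.5, -0.5]

# Arbitrary head weights (protein dimensions first, then glycan dimensions).
head_weights = [1.0, 1.0, 2.0, -2.0]

score = predict_binding(protein_embedding, glycan_nodes, glycan_edges, head_weights)
```

In a trained model the aggregation, pooling, and head would carry learned parameters, and the score would be fitted to measured binding data; the sketch only shows how the two representations can meet in a single prediction.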
Keyphrases
- deep learning
- convolutional neural network
- cell surface
- systematic review
- single cell
- cell adhesion
- machine learning
- electronic health record
- artificial intelligence
- big data
- sars cov
- palliative care
- high throughput
- high resolution
- working memory
- mesenchymal stem cells
- small molecule
- metabolic syndrome
- weight loss
- optical coherence tomography
- induced pluripotent stem cells
- bone marrow