Login / Signup

O-GlcNAcAtlas: A database of experimentally identified O-GlcNAc sites and proteins.

Junfeng MaYaoxiang LiChunyan HouCi Wu
Published in: Glycobiology (2022)
O-linked β-N-acetylglucosamine (O-GlcNAc) is a post-translational modification (i.e., O-GlcNAcylation) on the serine/threonine residues of proteins. As a unique intracellular monosaccharide modification, protein O-GlcNAcylation plays important roles in almost all biochemical processes examined. Aberrant O-GlcNAcylation underlies the etiologies of a number of chronic diseases. With the tremendous improvement of techniques, thousands of proteins along with their O-GlcNAc sites have been reported. However, until now, there are few databases dedicated to accommodate the rapid accumulation of such information. Thus, O-GlcNAcAtlas is created to integrate all experimentally identified O-GlcNAc sites and proteins. O-GlcNAcAtlas consists of two datasets (Dataset-I and Dataset-II, for unambiguously identified sites and ambiguously identified sites, respectively), representing a total number of 4571 O-GlcNAc modified proteins from all species studied from 1984 to 31 Dec 2019. For each protein, comprehensive information (including species, sample type, gene symbol, modified peptides and/or modification sites, site mapping methods and literature references) is provided. To solve the heterogeneity among the data collected from different sources, the sequence identity of these reported O-GlcNAc peptides are mapped to the UniProtKB protein entries. To our knowledge, O-GlcNAcAtlas is a highly comprehensive and rigorously curated database encapsulating all O-GlcNAc sites and proteins identified in the past 35 years. We expect that O-GlcNAcAtlas will be a useful resource to facilitate O-GlcNAc studies and computational analyses of protein O-GlcNAcylation. The public version of the web interface to the O-GlcNAcAtlas can be found at http://oglcnac.org/.
Keyphrases
  • amino acid
  • protein protein
  • healthcare
  • systematic review
  • binding protein
  • gene expression
  • adverse drug
  • big data
  • machine learning
  • small molecule
  • single cell
  • protein kinase
  • data analysis