Software for Peak Finding and Elemental Composition Assignment for Glycosaminoglycan Tandem Mass Spectra.
John D Hogan, Joshua A Klein, Jiandong Wu, Pradeep Chopra, Geert-Jan Boons, Luis Carvalho, Cheng Lin, Joseph Zaia
Published in: Molecular & cellular proteomics : MCP (2018)
Glycosaminoglycans (GAGs) covalently linked to proteoglycans (PGs) are characterized by repeating disaccharide units and variable sulfation patterns along the chain. GAG length and sulfation patterns impact disease etiology, cellular signaling, and structural support for cells. We and others have demonstrated the usefulness of tandem mass spectrometry (MS2) for assigning the structures of GAG saccharides; however, manual interpretation of tandem mass spectra is time-consuming, so computational methods must be employed. In the proteomics domain, the identification of monoisotopic peaks and charge states relies on algorithms that use averagine, or the average building block of the compound class being analyzed. Although these methods perform well for protein and peptide spectra, they perform poorly on GAG tandem mass spectra, because a single average building block does not characterize the variable sulfation of GAG disaccharide units. In addition, it is necessary to assign product ion isotope patterns to interpret the tandem mass spectra of GAG saccharides. To address these problems, we developed GAGfinder, the first tandem mass spectrum peak finding algorithm developed specifically for GAGs. We define peak finding as assigning experimental isotopic peaks directly to a given product ion composition, as opposed to deconvolution or peak picking, which are terms more accurately describing the existing methods previously mentioned. GAGfinder is a targeted, brute force approach to spectrum analysis that uses precursor composition information to generate all theoretical fragments. GAGfinder also performs peak isotope composition annotation, which is typically a subsequent step for averagine-based methods. Data are available via ProteomeXchange with identifier PXD009101.
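The idea of assigning experimental isotopic peaks directly to a known product ion composition, rather than fitting an averagine model, can be illustrated with a small sketch. This is not GAGfinder's code: the function names (`isotope_pattern`, `mono_mz`, `match_isotopes`), the 10 ppm tolerance, the negative-mode [M - zH]z- ion model, and the example composition are all assumptions made for illustration. The isotope abundances and masses are standard approximate values.

```python
# Illustrative sketch only; names, tolerance, and ion model are assumptions.

# Approximate natural isotope abundances, indexed by nominal mass offset
# from the lightest isotope (36S omitted because it is negligible).
ISOTOPES = {
    "C": [0.9893, 0.0107],
    "H": [0.999885, 0.000115],
    "N": [0.99636, 0.00364],
    "O": [0.99757, 0.00038, 0.00205],
    "S": [0.9499, 0.0075, 0.0425],
}
MONO_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740,
             "O": 15.9949146, "S": 31.9720707}
PROTON = 1.00727646
NEUTRON_SPACING = 1.00335  # approximate 13C - 12C mass difference


def convolve(a, b, max_len):
    """Multiply two abundance polynomials, keeping the first max_len terms."""
    out = [0.0] * min(len(a) + len(b) - 1, max_len)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            if i + j < len(out):
                out[i + j] += x * y
    return out


def isotope_pattern(composition, n_peaks=4):
    """Relative abundances of M, M+1, ... for an exact elemental composition."""
    pattern = [1.0]
    for element, count in composition.items():
        for _ in range(count):
            pattern = convolve(pattern, ISOTOPES[element], n_peaks)
    top = max(pattern)
    return [p / top for p in pattern]


def mono_mz(composition, charge):
    """Monoisotopic m/z of [M - zH]z-, with composition given for neutral M."""
    mass = sum(MONO_MASS[el] * n for el, n in composition.items())
    return (mass - charge * PROTON) / charge


def match_isotopes(peaks, composition, charge, ppm=10.0, n_peaks=4):
    """For each theoretical isotope peak, return the most intense experimental
    (m/z, intensity) pair within the ppm tolerance, or None if none is found."""
    base = mono_mz(composition, charge)
    matched = []
    for k in range(n_peaks):
        target = base + k * NEUTRON_SPACING / charge
        window = target * ppm * 1e-6
        hits = [p for p in peaks if abs(p[0] - target) <= window]
        matched.append(max(hits, key=lambda p: p[1]) if hits else None)
    return matched


# Example: C14H21NO14S, used here as an illustrative monosulfated
# disaccharide composition.
comp = {"C": 14, "H": 21, "N": 1, "O": 14, "S": 1}
pattern = isotope_pattern(comp)
```

Because the pattern is computed from the exact composition, the 34S contribution to the M+2 peak scales with the actual number of sulfur atoms in the fragment, which is the kind of variation a single average building block cannot capture. A fuller implementation would also compare the matched intensities against `pattern` to score the assignment.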
Keyphrases
- tandem mass spectrometry
- density functional theory
- mass spectrometry
- machine learning
- liquid chromatography
- mental health
- high resolution
- high performance liquid chromatography
- deep learning
- induced apoptosis
- multiple sclerosis
- ultra high performance liquid chromatography
- healthcare
- signaling pathway
- small molecule
- electronic health record
- cell proliferation
- cancer therapy
- single molecule
- health information
- endoplasmic reticulum stress
- big data
- cell death
- drug delivery
- cell cycle arrest
- social media
- label free
- amino acid