Extending Comet for Global Amino Acid Variant and Post-Translational Modification Analysis Using the PSI Extended FASTA Format.
Jimmy K Eng, Eric W Deutsch
Published in: Proteomics (2020)
Protein identification by tandem mass spectrometry sequence database searching is standard practice in many proteomics laboratories. The de facto standard for representing the sequence databases used as input to search tools is the FASTA format. The Human Proteome Organization's Proteomics Standards Initiative (PSI) has developed an extension to FASTA, the PSI extended FASTA format (PEFF), in which additional information such as structural annotations is encoded in the protein description lines. Comet has been extended to automatically analyze the post-translational modifications and amino acid substitutions encoded in PEFF databases. Comet's PEFF implementation and example results from searching a HEK293 dataset against the neXtProt PEFF database are presented.
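To illustrate how annotations can be carried in a protein description line, the following is a minimal Python sketch that reads a PEFF-style header and expands its single-residue substitutions. It is not Comet's implementation. The header, accession, sequence, and helper names (`parse_peff_header`, `parse_items`, `apply_simple_variant`) are made up for illustration, and the sketch assumes the `\Key=(field|field)` layout with `\ModResPsi` and `\VariantSimple` keys and 1-based positions.

```python
import re

# Hypothetical PEFF-style header: accession, positions, and annotations are invented.
HEADER = (
    ">nxp:NX_EXAMPLE-1 \\PName=Example protein "
    "\\ModResPsi=(5|MOD:00046|O-phospho-L-serine) "
    "\\VariantSimple=(3|K)(7|A)"
)


def parse_peff_header(line):
    """Split a PEFF-style description line into an identifier and a key -> value dict."""
    ident, *rest = line.lstrip(">").split(" \\")
    fields = {}
    for chunk in rest:
        key, _, value = chunk.partition("=")
        fields[key] = value.strip()
    return ident.strip(), fields


def parse_items(value):
    """Turn '(a|b)(c|d)' into [['a', 'b'], ['c', 'd']]."""
    return [item.split("|") for item in re.findall(r"\(([^()]*)\)", value)]


def apply_simple_variant(sequence, position, new_residue):
    """Return the sequence with one residue substituted (position assumed 1-based)."""
    i = int(position) - 1
    return sequence[:i] + new_residue + sequence[i + 1:]


ident, fields = parse_peff_header(HEADER)
mods = parse_items(fields.get("ModResPsi", ""))
variants = parse_items(fields.get("VariantSimple", ""))

sequence = "MAGESTRK"  # made-up protein sequence
# One candidate sequence per annotated substitution (only the first two fields are used).
variant_sequences = [apply_simple_variant(sequence, v[0], v[1]) for v in variants]
```

A search tool would then consider peptides from each variant sequence, and each listed modification at its annotated position, in addition to the unmodified reference sequence.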
Keyphrases
- amino acid
- quality improvement
- tandem mass spectrometry
- mass spectrometry
- liquid chromatography
- high performance liquid chromatography
- healthcare
- primary care
- gas chromatography
- ultra high performance liquid chromatography
- endothelial cells
- adverse drug
- simultaneous determination
- label free
- high resolution
- solid phase extraction
- high resolution mass spectrometry
- machine learning
- artificial intelligence
- deep learning
- electronic health record