Accurate Prediction of y Ions in Beam-Type Collision-Induced Dissociation Using Deep Learning.
HyeonSeok Shin, Youngmin Park, Kyunggeun Ahn, Sungsoo Kim
Published in: Analytical Chemistry (2022)
Peptide fragmentation spectra contain critical information for the identification of peptides by mass spectrometry. In this study, we developed an algorithm that more accurately predicts the high-intensity peaks in peptide spectra. The training data comprise 180,833 peptides from the National Institute of Standards and Technology and the Proteomics Identification database, fragmented by either quadrupole time-of-flight or triple-quadrupole collision-induced dissociation. Exploratory analysis of the peptide fragmentation patterns focused on the highest-intensity peaks and showed that proline, peptide length, and four-amino-acid combinations within a sliding window can be exploited as key features. The amino acid sequence of each peptide and each of the key features were allocated to different layers of the model, which combines recurrent, convolutional, and fully connected neural networks. The trained model, PrAI-frag, predicts fragmentation spectra more accurately than previous machine learning-based prediction algorithms. The model excels at high-intensity peak prediction, which is advantageous for selective/multiple reaction monitoring applications. PrAI-frag is provided via a Web server that accepts peptides 6-15 amino acids in length.
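To make the three key features concrete, here is a minimal Python sketch of how they might be derived from a peptide sequence before being passed to a model. It is an illustration only, not the PrAI-frag implementation: the names `PeptideFeatures` and `extract_features` are hypothetical, the representation (0-based proline positions, overlapping 4-residue substrings) is an assumption, and the abstract does not say how the features are actually encoded.

```python
from __future__ import annotations

from dataclasses import dataclass

# Length range the abstract gives for the Web server.
MIN_LEN, MAX_LEN = 6, 15
# Sliding-window width named in the abstract.
WINDOW = 4


@dataclass
class PeptideFeatures:
    """Hypothetical container for the three features named in the abstract."""
    length: int
    proline_positions: list[int]  # 0-based indices of proline (P); an assumed encoding
    windows: list[str]            # overlapping 4-residue substrings; an assumed encoding


def extract_features(peptide: str) -> PeptideFeatures:
    """Derive length, proline positions, and 4-residue windows from a sequence."""
    seq = peptide.strip().upper()
    if not MIN_LEN <= len(seq) <= MAX_LEN:
        raise ValueError(
            f"peptide length {len(seq)} is outside {MIN_LEN}-{MAX_LEN}"
        )
    prolines = [i for i, aa in enumerate(seq) if aa == "P"]
    windows = [seq[i:i + WINDOW] for i in range(len(seq) - WINDOW + 1)]
    return PeptideFeatures(len(seq), prolines, windows)


if __name__ == "__main__":
    print(extract_features("PEPTIDEK"))
```

A sequence of length n yields n - 3 overlapping windows, so the 6-15 residue range produces between 3 and 12 windows per peptide.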
Keyphrases
- high intensity
- neural network
- amino acid
- mass spectrometry
- deep learning
- machine learning
- convolutional neural network
- liquid chromatography
- high performance liquid chromatography
- high resolution
- gas chromatography
- artificial intelligence
- big data
- tandem mass spectrometry
- label free
- data analysis