A Handle on Mass Coincidence Errors in De Novo Sequencing of Antibodies by Bottom-up Proteomics.
Douwe Schulte, Joost Snijder
Published in: Journal of proteome research (2024)
Antibody sequences can be determined at 99% accuracy directly from the polypeptide product by using bottom-up proteomics techniques. Sequencing accuracy at the peptide level is limited by three factors: the isobaric residues leucine and isoleucine; incomplete fragmentation spectra, in which the order of two or more residues remains ambiguous because fragment ions for the intermediate positions are missing; and isobaric combinations of amino acids, potentially of different lengths, for example, GG = N and GA = Q. Here, we present several updates to Stitch (v1.5), which performs template-based assembly of de novo peptides to reconstruct antibody sequences. This version introduces a mass-based alignment algorithm that explicitly accounts for mass coincidence errors. In addition, it incorporates a postprocessing procedure to assign I/L residues based on secondary fragments (satellite ions, i.e., w-ions). Moreover, evidence for sequence assignments can now be evaluated directly in an integrated spectrum viewer. Lastly, Stitch now accepts input from a wider selection of de novo peptide sequencing algorithms, including Casanovo, PEAKS, Novor.Cloud, pNovo, and MaxNovo, in addition to flat text and FASTA. Combined, these changes make Stitch compatible with a larger range of data processing pipelines and improve its tolerance to peptide-level sequencing errors.
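The mass coincidences named in the abstract (GG = N, GA = Q, and I = L) can be checked with a few lines of arithmetic on standard monoisotopic residue masses. The sketch below is illustrative only and is not Stitch's alignment algorithm; the function names `residue_mass`, `isobaric`, and `coincident_pairs`, as well as the 0.005 Da tolerance, are assumptions chosen for the example.

```python
from itertools import combinations_with_replacement

# Standard monoisotopic residue masses in Da (amino acid minus water).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def residue_mass(seq: str) -> float:
    """Sum of residue masses for a short stretch of sequence."""
    return sum(MONO[aa] for aa in seq)


def isobaric(a: str, b: str, tol_da: float = 0.005) -> bool:
    """True if two stretches differ in mass by at most tol_da.

    The tolerance is a hypothetical value for illustration.
    """
    return abs(residue_mass(a) - residue_mass(b)) <= tol_da


def coincident_pairs(tol_da: float = 0.005) -> dict[str, list[str]]:
    """Map each single residue to the residue pairs matching its mass."""
    hits: dict[str, list[str]] = {}
    for a, b in combinations_with_replacement(sorted(MONO), 2):
        for single in MONO:
            if isobaric(a + b, single, tol_da):
                hits.setdefault(single, []).append(a + b)
    return hits


if __name__ == "__main__":
    print(isobaric("GG", "N"), isobaric("GA", "Q"), isobaric("L", "I"))
    print(coincident_pairs())
```

Because a pair and a single residue can share a mass, an aligner that compares sequences residue by residue would score GG against N as a mismatch plus a gap, whereas a comparison on mass treats them as equivalent. Widening the tolerance brings in further near-coincidences, so the value chosen should reflect the mass accuracy of the instrument.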
Keyphrases
- single cell
- amino acid
- machine learning
- quantum dots
- mass spectrometry
- patient safety
- electronic health record
- adverse drug
- deep learning
- big data
- aqueous solution
- emergency department
- minimally invasive
- water soluble
- density functional theory
- genetic diversity
- psychometric properties
- molecular dynamics
- quality improvement