Oktoberfest: Open-source spectral library generation and rescoring pipeline based on Prosit.
Mario Picciani, Wassim Gabriel, Victor-George Giurcoiu, Omar Shouman, Firas Hamood, Ludwig Lautenbacher, Cecilia Bang Jensen, Julian Müller, Mostafa Kalhor, Armin Soleymaniniya, Bernhard Kuster, Matthew The, Mathias Wilhelm
Published in: Proteomics (2023)
Machine learning (ML) and deep learning (DL) models for peptide property prediction, such as Prosit, have enabled the creation of high-quality in silico reference libraries. These libraries are used in various applications, ranging from data-independent acquisition (DIA) data analysis to data-driven rescoring of search engine results. Here, we present Oktoberfest, an open-source Python package of our spectral library generation and rescoring pipeline, which was originally available only online via ProteomicsDB. Oktoberfest is largely search engine agnostic and provides access to online peptide property predictions, promoting the adoption of state-of-the-art ML/DL models in proteomics analysis pipelines. We demonstrate its ability to reproduce and even improve our results from previously published rescoring analyses in two distinct use cases. Oktoberfest is freely available on GitHub (https://github.com/wilhelm-lab/oktoberfest) and can easily be installed locally through the cross-platform PyPI Python package.
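To illustrate what data-driven rescoring can build on, the sketch below computes a normalized spectral angle between a predicted and an observed fragment intensity vector. This similarity measure is commonly reported for Prosit-style intensity predictions. The abstract does not state which features Oktoberfest derives, so this is an assumption for illustration only: the function name `spectral_angle` and the example intensities are hypothetical and are not part of the Oktoberfest API.

```python
import math


def spectral_angle(predicted, observed):
    """Normalized spectral angle between two intensity vectors.

    Returns 1.0 when the vectors point in the same direction and 0.0 when
    they are orthogonal. Hypothetical helper, not an Oktoberfest function.
    """
    if len(predicted) != len(observed):
        raise ValueError("intensity vectors must have the same length")

    dot = sum(p * o for p, o in zip(predicted, observed))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_o = math.sqrt(sum(o * o for o in observed))
    if norm_p == 0.0 or norm_o == 0.0:
        # An all-zero spectrum carries no direction; treat it as no match.
        return 0.0

    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_p * norm_o)))
    return 1.0 - 2.0 * math.acos(cosine) / math.pi


# Made-up intensities for matched fragment ions of one peptide-spectrum match.
predicted_intensities = [0.10, 0.80, 0.05, 0.40]
observed_intensities = [0.12, 0.75, 0.00, 0.45]
score = spectral_angle(predicted_intensities, observed_intensities)
```

A score like this could be passed, alongside the search engine's own scores, as one feature to a rescoring tool; which features and tools are used in practice is defined by the pipeline itself.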
Keyphrases
- data analysis
- machine learning
- deep learning
- optical coherence tomography
- social media
- electronic health record
- health information
- big data
- artificial intelligence
- mass spectrometry
- high throughput
- randomized controlled trial
- magnetic resonance imaging
- magnetic resonance
- computed tomography
- convolutional neural network
- meta analyses
- label free