Benchmarking mass spectrometry based proteomics algorithms using a simulated database.
Muaaz Gul Awan, Abdullah Gul Awan, Fahad Saeed
Published in: Network modeling and analysis in health informatics and bioinformatics (2021)
Protein sequencing algorithms process data generated by a variety of instruments under diverse experimental conditions. Currently, there is no way to predict the accuracy of an algorithm for a given data set. Most published algorithms and their associated software have been evaluated on a limited number of experimental data sets. However, these performance evaluations do not cover the complete search space that the algorithm and the software might encounter in the real world. To this end, we present a database of simulated spectra that can be used to benchmark any spectra-to-peptide search engine. We demonstrate the usability of this database by benchmarking two popular peptide sequencing engines. We show wide variation in the accuracy of peptide deductions, and that a complete quality profile of a given algorithm can be useful for practitioners and algorithm developers. All benchmarking data is available at https://users.cs.fiu.edu/~fsaeed/Benchmark.html.
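Because every simulated spectrum has a known source peptide, a search engine's output can be scored directly against that ground truth. The following is a minimal sketch of that idea, not the authors' evaluation code: the function name `identification_accuracy`, the dictionary layout (spectrum ID mapped to peptide string), and the toy data are all assumptions made for illustration, and the sketch does not reflect the actual file format of the published database.

```python
from typing import Dict, Optional


def identification_accuracy(
    truth: Dict[str, str],
    predicted: Dict[str, Optional[str]],
) -> float:
    """Return the fraction of simulated spectra whose reported peptide
    exactly matches the peptide the spectrum was simulated from.

    Spectra missing from `predicted` (no identification reported)
    count as incorrect.
    """
    if not truth:
        raise ValueError("truth must contain at least one spectrum")
    correct = sum(
        1 for spectrum_id, peptide in truth.items()
        if predicted.get(spectrum_id) == peptide
    )
    return correct / len(truth)


# Hypothetical toy data: spectrum ID -> peptide sequence.
truth = {"s1": "PEPTIDE", "s2": "ELVISK", "s3": "SAMPLER", "s4": "AAAGK"}
# Hypothetical engine output: s2 is wrong, s4 was not identified.
predicted = {"s1": "PEPTIDE", "s2": "ELVESK", "s3": "SAMPLER"}

accuracy = identification_accuracy(truth, predicted)
print(f"accuracy = {accuracy:.2f}")
```

Computing this score separately for each simulated condition would give one way to build a per-algorithm quality profile. Exact string matching is the simplest choice here; a real evaluation might relax it, for example by treating residues of equal mass as equivalent.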
Keyphrases
- machine learning
- electronic health record
- deep learning
- big data
- mass spectrometry
- artificial intelligence
- adverse drug
- emergency department
- systematic review
- healthcare
- neural network
- quality improvement
- high performance liquid chromatography
- health information
- ms/ms
- simultaneous determination
- gas chromatography
- meta analyses