Benefits of Iterative Searches of Large Databases to Interpret Large Human Gut Metaproteomic Data Sets.
Ariane Bassignani, Sandra Plancade, Magali Berland, Melisande Blein-Nicolas, Alain Guillot, Didier Chevret, Chloé Moritz, Sylvie Huet, Salwa Rizkalla, Karine Clément, Joël Doré, Olivier Langella, Catherine Juste
Published in: Journal of proteome research (2021)
The gut microbiota is increasingly considered a main partner of human health. Metaproteomics enables us to move from the functional potential revealed by metagenomics to the functions actually operating in the microbiome. However, metaproteome deciphering remains challenging. In particular, confident interpretation of a myriad of MS/MS spectra can only be pursued with smart database searches. Here, we compare the interpretation of MS/MS data sets from 48 individual human gut microbiomes using three interrogation strategies of the dedicated Integrated nonredundant Gene Catalog (IGC; 9.9 million genes from 1267 individual fecal samples) together with the Homo sapiens database: the classical single-step interrogation strategy and two iterative strategies (in either two or three steps) aimed at preselecting a reduced, more targeted search space for the final peptide spectrum matching. Both iterative searches outperformed the classical single-step search in terms of the number of peptides and protein clusters identified and the depth of taxonomic and functional knowledge, and the gain was most convincing with the three-step approach. However, iterative searches did not help reduce the variability of repeated analyses, which is inherent to the traditional data-dependent acquisition mode. This variability nevertheless did not affect the hierarchical relationship between replicates and all other samples.
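The idea of preselecting a reduced search space can be illustrated with a minimal toy sketch. It is not the authors' pipeline: real peptide spectrum matching scores MS/MS spectra against theoretical fragments, whereas here an exact substring lookup stands in for the search engine. The function names, the sequences, and the peptides below are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple


def matching_proteins(peptide: str, database: Dict[str, str]) -> Set[str]:
    """Toy stand-in for a search engine: exact substring lookup.

    Returns the IDs of all database entries whose sequence contains the peptide.
    """
    return {pid for pid, seq in database.items() if peptide in seq}


def two_step_search(
    peptides: List[str], database: Dict[str, str]
) -> Tuple[Dict[str, str], Dict[str, Set[str]]]:
    """Two-step iterative search over a (hypothetical) protein database."""
    # Step 1: survey search against the full database; keep every
    # entry that receives at least one hit.
    hit_ids: Set[str] = set()
    for pep in peptides:
        hit_ids |= matching_proteins(pep, database)
    reduced_db = {pid: database[pid] for pid in hit_ids}

    # Step 2: final matching against the reduced, more targeted database.
    final_matches = {pep: matching_proteins(pep, reduced_db) for pep in peptides}
    return reduced_db, final_matches


# Hypothetical four-entry "catalog" and two observed peptides.
full_db = {
    "P1": "MKTAYIAKQR",
    "P2": "GAVLIPFMWS",
    "P3": "QRLDEAYIAK",
    "P4": "TTTTSSSSNN",
}
peptides = ["AYIAK", "LDEAY"]

reduced_db, final_matches = two_step_search(peptides, full_db)
print(sorted(reduced_db))  # entries retained for the final search
print(final_matches)
```

In this toy setting the final matches are the same as a single-step search would give; the point is only that the second step runs against two entries instead of four. In a real workflow, the smaller search space changes the statistics of peptide identification, which is where the benefit reported above would come from.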
Keyphrases
- human health
- ms ms
- image quality
- endothelial cells
- big data
- risk assessment
- electronic health record
- genome wide
- induced pluripotent stem cells
- healthcare
- pluripotent stem cells
- adverse drug
- computed tomography
- amino acid
- copy number
- genome wide identification
- data analysis
- machine learning
- emergency department
- magnetic resonance
- drug delivery
- mass spectrometry
- binding protein
- bioinformatics analysis