High-Abundance Protein-Guided Hybrid Spectral Library for Data-Independent Acquisition Metaproteomics.
Enhui Wu, Yi Yang, Jinzhi Zhao, Jianxujie Zheng, Xiaoqing Wang, Chengpin Shen, Liang Qiao
Published in: Analytical Chemistry (2024)
Metaproteomics offers a direct avenue to identify microbial proteins in microbiota, enabling the compositional and functional characterization of microbiota. Because microbial communities are complex and heterogeneous, in-depth and accurate metaproteomics faces considerable challenges. One challenge is the construction of a suitable protein sequence database to interpret the highly complex metaproteomic data, especially in the absence of metagenomic sequencing data. Herein, we present a high-abundance protein-guided hybrid spectral library strategy for in-depth data-independent acquisition (DIA) metaproteomic analysis (HAPs-hyblibDIA). A dedicated high-abundance protein database of gut microbial species is constructed and used to mine the taxonomic information of microbiota samples. A sample-specific protein sequence database is then built from UniProt protein sequences based on this taxonomic information and used for hybrid spectral library-based analysis of the DIA data. We evaluated the accuracy and sensitivity of the method using synthetic microbial community samples and human gut microbiome samples. The strategy successfully identified the taxonomic compositions of microbiota samples, and the peptides identified by HAPs-hyblibDIA overlapped greatly with those identified using a metagenomic sequencing-derived database. At the peptide and species levels, our results can complement those obtained using a metagenomic sequencing-derived database. Furthermore, we validated the applicability of the HAPs-hyblibDIA strategy in a cohort of human gut microbiota samples from colorectal cancer patients and controls, highlighting its usability in biomedical research.
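The two database-construction steps described in the abstract (mining taxonomy from high-abundance protein hits, then restricting a reference database to the detected species) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `infer_species` and `build_sample_database`, the input tuple layouts, and the rule of keeping a species only when it has at least `min_unique_peptides` species-unique peptides are all assumptions made for illustration; the abstract does not state the actual criteria.

```python
from collections import defaultdict


def infer_species(peptide_hits, min_unique_peptides=2):
    """Return species supported by enough species-unique peptides.

    peptide_hits: iterable of (peptide, species) pairs, assumed to come from
    searching the data against a high-abundance protein database.
    The uniqueness rule and threshold are illustrative assumptions.
    """
    species_by_peptide = defaultdict(set)
    for peptide, species in peptide_hits:
        species_by_peptide[peptide].add(species)

    unique_counts = defaultdict(int)
    for species_set in species_by_peptide.values():
        # Count only peptides that map to exactly one species.
        if len(species_set) == 1:
            unique_counts[next(iter(species_set))] += 1

    return {s for s, n in unique_counts.items() if n >= min_unique_peptides}


def build_sample_database(reference_records, detected_species):
    """Keep reference proteins whose species was detected in the sample.

    reference_records: iterable of (header, sequence, species) tuples,
    a hypothetical pre-parsed form of a protein sequence database.
    """
    return [(header, seq) for header, seq, species in reference_records
            if species in detected_species]


# Toy data, invented for this example only.
hits = [
    ("PEPA", "Escherichia coli"),
    ("PEPB", "Escherichia coli"),
    ("PEPC", "Bacteroides fragilis"),
    ("PEPD", "Escherichia coli"),
    ("PEPD", "Bacteroides fragilis"),  # shared peptide, not counted
]
reference = [
    ("prot1", "MKTAYIAK", "Escherichia coli"),
    ("prot2", "MSDNLQRV", "Bacteroides fragilis"),
]

detected = infer_species(hits)
sample_db = build_sample_database(reference, detected)
```

In this toy input only Escherichia coli has two species-unique peptides, so the sample-specific database keeps `prot1` alone. A looser threshold would retain more species at the cost of a larger search space.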
Keyphrases
- microbial community
- antibiotic resistance genes
- amino acid
- electronic health record
- optical coherence tomography
- protein protein
- single cell
- binding protein
- emergency department
- adverse drug
- wastewater treatment
- computed tomography
- machine learning
- healthcare
- high resolution
- data analysis
- mass spectrometry
- health information
- deep learning
- dual energy
- artificial intelligence