Integrative analysis of microbial 16S gene and shotgun metagenomic sequencing data improves statistical efficiency.

Ye Yue, Timothy D Read, Veronika Fedirko, Glen A Satten, Yi-Juan Hu
Published in: bioRxiv : the preprint server for biology (2023)
The most widely used technologies for profiling microbial communities are 16S marker-gene sequencing and shotgun metagenomic sequencing. Interestingly, many microbiome studies have performed both sequencing experiments on the same cohort of samples. The two sequencing datasets often reveal consistent patterns of microbial signatures, highlighting the potential for an integrative analysis to improve the power of testing these signatures. However, differential experimental biases, partially overlapping samples, and differential library sizes pose tremendous challenges when combining the two datasets. Currently, researchers either discard one dataset entirely or use different datasets for different objectives. In this article, we introduce the first method of this kind, named Com-2seq, which combines the two sequencing datasets to test differential abundance at the genus and community levels while overcoming these difficulties. We demonstrate that Com-2seq substantially improves statistical efficiency over analysis of either dataset alone and works better than two ad hoc approaches.
Keyphrases
  • single cell
  • rna seq
  • genome wide
  • microbial community
  • healthcare
  • antibiotic resistance genes
  • machine learning
  • electronic health record
  • gene expression
  • mental health
  • big data
  • deep learning
  • anaerobic digestion