Login / Signup

A Zero-inflated Beta-binomial Model for Microbiome Data Analysis.

Tao HuPaul GallinsYi-Hui Zhou
Published in: Stat (International Statistical Institute) (2018)
The microbiome is increasingly recognized as an important aspect of the health of host species, involved in many biological pathways and processes and potentially useful as health biomarkers. Taking advantage of high-throughput sequencing technologies, modern bacterial microbiome studies are metagenomic, interrogating thousands of taxa simultaneously. Several data analysis frameworks have been proposed for microbiome sequence read count data and determining the most significant features. However, there is still room for improvement. We introduce a zero-inflated beta-binomial (ZIBB) to model the distribution of microbiome count data and to determine association with a continuous or categorical phenotype of interest. The approach can exploit mean-variance relationships to improve power and adjust for covariates. The proposed method is a mixture model with two components: (i) a zero model accounting for excess zeros and (ii) a count model to capture the remaining component by beta-binomial regression, allowing for overdispersion effects. Simulation studies show that our proposed method effectively controls type I error and has higher power than competing methods to detect taxa associated with phenotype. An R package ZIBBSeqDiscovery is available on R CRAN.
Keyphrases
  • data analysis
  • healthcare
  • mental health
  • electronic health record
  • big data
  • climate change
  • case control
  • microbial community
  • human health