
An integrated Bayesian framework for multi-omics prediction and classification.

Himel Mallick, Anupreet Porwal, Satabdi Saha, Piyali Basak, Vladimir Svetnik, Erina Paul
Published in: Statistics in Medicine (2023)
With the growing commonality of multi-omics datasets, there is now increasing evidence that integrated omics profiles lead to more efficient discovery of clinically actionable biomarkers that enable better disease outcome prediction and patient stratification. Several methods exist to perform host phenotype prediction from cross-sectional, single-omics data modalities, but decentralized frameworks that jointly analyze multiple time-dependent omics data to highlight the integrative and dynamic impact of repeatedly measured biomarkers are currently limited. In this article, we propose a novel Bayesian ensemble method to consolidate prediction by combining information across several longitudinal and cross-sectional omics data layers. Unlike existing frequentist paradigms, our approach enables uncertainty quantification in prediction as well as interval estimation for a variety of quantities of interest based on posterior summaries. We apply our method to four published multi-omics datasets and demonstrate that it recapitulates known biology in addition to providing novel insights while also outperforming existing methods in estimation, prediction, and uncertainty quantification. Our open-source software is publicly available at https://github.com/himelmallick/IntegratedLearner.
Keyphrases
  • single cell
  • cross sectional
  • rna seq
  • electronic health record
  • big data
  • machine learning
  • systematic review
  • deep learning