Increasing reproducibility, robustness, and generalizability of biomarker selection from meta-analysis using Bayesian methodology.

Laurynas Kalesinskas, Sanjana Gupta, Purvesh Khatri
Published in: PLoS computational biology (2022)
A major limitation of gene expression biomarker studies is that they are not reproducible, as they do not generalize to larger, real-world, heterogeneous populations. Frequentist multi-cohort gene expression meta-analysis has frequently been used to address this problem and to identify biomarkers that are truly differentially expressed. However, the frequentist meta-analysis framework has its limitations: it needs at least 4-5 datasets with hundreds of samples, is prone to confounding from outliers, and relies on multiple-hypothesis-corrected p-values. To address these shortcomings, we have created a Bayesian meta-analysis framework for the analysis of gene expression data. Using real-world data from three different diseases, we show that the Bayesian method is more robust to outliers, creates more informative estimates of between-study heterogeneity, reduces the number of false positive and false negative biomarkers, and selects more generalizable biomarkers with less data. We have compared the Bayesian framework to a previously published frequentist framework and have developed a publicly available R package for its use.
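To make the frequentist baseline concrete, the following is a minimal sketch of a classical random-effects pooling step (the DerSimonian-Laird estimator) for a single gene across several cohorts. It is not the authors' Bayesian framework and not the API of their R package; the function name, the effect sizes, and the variances are all hypothetical values chosen for illustration. It only shows how one outlying cohort can shift the pooled effect and inflate the between-study heterogeneity estimate.

```python
import math


def dersimonian_laird(effects, variances):
    """Pool per-study effect sizes with the DerSimonian-Laird estimator.

    Returns (pooled_effect, standard_error, tau2), where tau2 is the
    estimated between-study variance (heterogeneity).
    """
    k = len(effects)
    # Inverse-variance (fixed-effect) weights.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and the method-of-moments estimate of tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Hypothetical standardized effect sizes for one gene in four cohorts.
clean_effects = [0.8, 0.9, 0.7, 0.85]
clean_variances = [0.04, 0.04, 0.04, 0.04]
pooled_clean, se_clean, tau2_clean = dersimonian_laird(clean_effects, clean_variances)

# Same gene with one hypothetical outlying cohort added.
outlier_effects = clean_effects + [-1.5]
outlier_variances = clean_variances + [0.04]
pooled_out, se_out, tau2_out = dersimonian_laird(outlier_effects, outlier_variances)

print("without outlier:", pooled_clean, se_clean, tau2_clean)
print("with outlier:   ", pooled_out, se_out, tau2_out)
```

In this toy input, the single outlying cohort pulls the pooled effect toward zero and turns a heterogeneity estimate of zero into a large one, which illustrates the sensitivity to outliers described above.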
Keyphrases
  • gene expression
  • systematic review
  • meta analyses
  • case control
  • dna methylation
  • big data
  • single cell
  • rna seq
  • deep learning