SABER: Statistical Identification of Loci of Interest in GWAS Summary Statistics using a Bayesian Gaussian Mixture Model.
Rachit Kumar, Rasika Venkatesh, Marylyn D Ritchie
Published in: AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science (2024)
Genome-wide association studies (GWAS) remain a popular method for identifying novel genetic associations with human phenotypes and have yielded insights into the etiology of many diseases. However, GWAS offer limited evidence for how a genetic association might contribute to disease because of inherent limitations such as linkage disequilibrium. Many methods that operate on GWAS summary statistics have therefore been developed to generate evidence for functional pathways or for variants of interest, but they require the user to define the genomic region bounds of each locus of interest. At present, few methods determine these bounds in a rigorous, reproducible way. We present a novel statistical method, Statistical Analysis for Bayesian Estimation of Regions (SABER), that uses Bayesian Gaussian mixture models to reproducibly generate ratios that quantify whether particular genomic positions represent the bounds of loci of interest. These ratios can be used to delineate genomic regions for downstream analyses.
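To make the general idea concrete, the following is a minimal sketch of how a two-component Gaussian mixture over per-variant association scores could yield a ratio that marks where a locus begins and ends. It is not the SABER algorithm: the abstract does not specify how the ratios are computed, so everything here is an assumption made for illustration. The sketch fits a one-dimensional mixture by MAP-EM with a Dirichlet prior on the mixture weights (a simplified stand-in for a full Bayesian Gaussian mixture model), treats the higher-mean component as "signal", and reports the log posterior odds of signal versus background at each position. The function names, the prior strength `alpha`, the variance floor, and the toy positions and scores are all hypothetical.

```python
import math

def log_normal_pdf(x, mean, var):
    """Log density of a univariate normal distribution."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)

def fit_two_component_map(values, alpha=2.0, n_iter=200, var_floor=1e-3):
    """MAP-EM for a 1-D, two-component Gaussian mixture.

    Assumption: only the mixture weights get a prior, Dirichlet(alpha, alpha);
    means and variances are updated by weighted maximum likelihood.
    """
    n = len(values)
    overall_mean = sum(values) / n
    overall_var = max(sum((v - overall_mean) ** 2 for v in values) / n, var_floor)
    weights = [0.5, 0.5]
    means = [min(values), max(values)]  # start one component at each extreme
    variances = [overall_var, overall_var]
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability.
        resp = []
        for x in values:
            logs = [math.log(weights[k]) + log_normal_pdf(x, means[k], variances[k])
                    for k in range(2)]
            top = max(logs)
            exps = [math.exp(l - top) for l in logs]
            total = sum(exps)
            resp.append([e / total for e in exps])
        # M-step: MAP update for weights, weighted ML for means and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = (nk + alpha - 1.0) / (n + 2.0 * (alpha - 1.0))
            if nk > 1e-12:
                means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
                spread = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
                variances[k] = max(spread, var_floor)
    return weights, means, variances

def log_posterior_odds(values, weights, means, variances):
    """Log of P(signal | x) / P(background | x); signal is the higher-mean component."""
    signal = 0 if means[0] > means[1] else 1
    background = 1 - signal
    odds = []
    for x in values:
        log_sig = math.log(weights[signal]) + log_normal_pdf(x, means[signal], variances[signal])
        log_bg = math.log(weights[background]) + log_normal_pdf(x, means[background], variances[background])
        odds.append(log_sig - log_bg)
    return odds

def find_regions(positions, log_odds, threshold=0.0):
    """Return (start, end) for each contiguous run of positions whose log odds exceed threshold."""
    regions, start, last = [], None, None
    for pos, value in zip(positions, log_odds):
        if value > threshold:
            if start is None:
                start = pos
            last = pos
        elif start is not None:
            regions.append((start, last))
            start = None
    if start is not None:
        regions.append((start, last))
    return regions

# Hypothetical toy data: base-pair positions and -log10(p) values.
positions = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200]
scores = [0.4, 0.9, 0.6, 1.1, 6.5, 8.2, 9.1, 7.4, 0.8, 0.5, 1.0, 0.7]

weights, means, variances = fit_two_component_map(scores)
odds = log_posterior_odds(scores, weights, means, variances)
print(find_regions(positions, odds))
```

In this sketch the region bounds fall wherever the log odds change sign between adjacent positions. A real analysis would also have to account for linkage disequilibrium and variant density, neither of which this toy example models.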