A modeling framework for detecting and leveraging node-level information in Bayesian network inference.
Xiaoyue Xi, Hélène Ruffieux. Published in: Biostatistics (Oxford, England) (2024)
Bayesian graphical models are powerful tools to infer complex relationships in high dimension, yet are often fraught with computational and statistical challenges. If exploited in a principled way, the increasing information collected alongside the data of primary interest constitutes an opportunity to mitigate these difficulties by guiding the detection of dependence structures. For instance, gene network inference may be informed by the use of publicly available summary statistics on the regulation of genes by genetic variants. Here we present a novel Gaussian graphical modeling framework to identify and leverage information on the centrality of nodes in conditional independence graphs. Specifically, we consider a fully joint hierarchical model to simultaneously infer (i) sparse precision matrices and (ii) the relevance of node-level information for uncovering the sought-after network structure. We encode such information as candidate auxiliary variables using a spike-and-slab submodel on the propensity of nodes to be hubs, which allows hypothesis-free selection and interpretation of a sparse subset of relevant variables. As efficient exploration of large posterior spaces is needed for real-world applications, we develop a variational expectation conditional maximization algorithm that scales inference to hundreds of samples, nodes and auxiliary variables. We illustrate and exploit the advantages of our approach in simulations and in a gene network study which identifies hub genes involved in biological pathways relevant to immune-mediated diseases.
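To make the model structure described in the abstract more concrete, the following is a minimal generative sketch, not the authors' implementation. It assumes a probit link in which each node's hub propensity is an additive function of its auxiliary variables, with spike-and-slab effects so that only a sparse subset of variables matters. The function names, the binary auxiliary variables, the baseline sparsity level, and the diagonal-dominance construction of the precision matrix are all illustrative assumptions. The sketch only simulates from such a model; it does not perform the variational expectation conditional maximization inference.

```python
import math
import numpy as np


def normal_cdf(x):
    """Standard normal CDF applied elementwise (probit link, assumed here)."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(x / math.sqrt(2.0)))


def simulate_hub_graph(n_nodes=30, n_aux=5, n_relevant=2,
                       baseline=-1.8, slab_sd=1.0, seed=0):
    """Simulate a graph whose hubs are driven by node-level auxiliary variables."""
    rng = np.random.default_rng(seed)

    # Hypothetical binary auxiliary variables, one row per node.
    V = rng.binomial(1, 0.1, size=(n_nodes, n_aux)).astype(float)

    # Spike-and-slab effects: gamma marks the relevant variables ("slab"),
    # all other effects are exactly zero ("spike").
    gamma = np.zeros(n_aux, dtype=bool)
    gamma[:n_relevant] = True
    beta = np.where(gamma, np.abs(rng.normal(0.0, slab_sd, n_aux)), 0.0)

    # Node-level hub propensity and resulting edge inclusion probabilities.
    hub_score = V @ beta
    probs = normal_cdf(baseline + hub_score[:, None] + hub_score[None, :])

    # Draw each undirected edge once (strict upper triangle), then symmetrize.
    upper = np.triu(rng.random((n_nodes, n_nodes)) < probs, k=1)
    adjacency = upper | upper.T
    return V, gamma, beta, probs, adjacency


def precision_from_graph(adjacency, strength=0.3):
    """Build a sparse precision matrix supported on the graph.

    Strict diagonal dominance with a positive diagonal guarantees
    positive definiteness for this symmetric matrix.
    """
    omega = adjacency.astype(float) * strength
    np.fill_diagonal(omega, omega.sum(axis=1) + 1.0)
    return omega


if __name__ == "__main__":
    V, gamma, beta, probs, adjacency = simulate_hub_graph(seed=1)
    omega = precision_from_graph(adjacency)

    # Gaussian samples whose conditional independence graph is `adjacency`.
    rng = np.random.default_rng(2)
    Y = rng.multivariate_normal(np.zeros(len(omega)), np.linalg.inv(omega), size=100)

    print("node degrees:", adjacency.sum(axis=1))
    print("data shape:", Y.shape)
```

Under these assumptions, nodes that carry a relevant auxiliary variable receive a higher edge inclusion probability with every other node, so they tend to emerge as hubs; irrelevant variables have zero effect and leave the graph unchanged.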
Keyphrases
- genome wide
- health information
- sentinel lymph node
- single cell
- lymph node
- genome wide identification
- network analysis
- copy number
- machine learning
- healthcare
- electronic health record
- molecular dynamics
- radiation therapy
- social media
- neural network
- genome wide analysis
- gene expression
- rectal cancer
- data analysis
- sensitive detection
- locally advanced