MRSL: a causal network pruning algorithm based on GWAS summary data.
Lei Hou, Zhi Geng, Zhongshang Yuan, Xu Shi, Chuan Wang, Feng Chen, Hongkai Li, Xiaokang Ji
Published in: Briefings in Bioinformatics (2024)
Causal discovery is a powerful tool for revealing underlying structures from purely observational data. Genetic variants can provide useful complementary information for structure learning. Recently, Mendelian randomization (MR) studies have provided abundant marginal causal relationships between traits. Here, we propose MRSL (MR-based structure learning algorithm), a causal network pruning algorithm based on these marginal causal relationships. MRSL combines graph theory with multivariable MR to learn the conditional causal structure using only genome-wide association study (GWAS) summary statistics. Specifically, MRSL uses topological sorting to improve the precision of structure learning. It introduces MR-separation in place of d-separation, together with three candidate sufficient separating sets for MR-separation. In simulations, MRSL achieved up to a 2-fold higher F1 score and up to 100 times faster computing time than eight competing methods. Furthermore, we applied MRSL to 26 biomarkers and 44 diseases defined by the International Classification of Diseases, 10th Revision (ICD-10), using GWAS summary data from UK Biobank. The results cover most of the expected causal links that have biological interpretations, as well as several new links supported by clinical case reports or previous observational literature.
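As a rough illustration of the topological-sorting step mentioned in the abstract, the sketch below orders a toy marginal causal graph so that every cause precedes its effects, then lists, for each marginal edge, the other marginal causes of the outcome. Everything here is an assumption made for illustration: the function names, the toy graph, and the choice of adjustment set are hypothetical and are not the three candidate separating sets defined by MRSL. The multivariable MR test that would decide whether an edge is pruned is omitted.

```python
from collections import deque


def topological_order(edges, nodes):
    """Kahn's algorithm: return nodes so that each cause precedes its effects."""
    children = {n: [] for n in nodes}
    indegree = {n: 0 for n in nodes}
    for cause, effect in edges:
        children[cause].append(effect)
        indegree[effect] += 1
    # Start from traits with no marginal causes; sort for a deterministic order.
    queue = deque(sorted(n for n in nodes if indegree[n] == 0))
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in sorted(children[node]):
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    if len(order) != len(nodes):
        raise ValueError("marginal graph contains a cycle")
    return order


def candidate_adjustment_sets(edges, nodes):
    """For each marginal edge X -> Y, list the other marginal causes of Y.

    Hypothetical choice of adjustment set, used only to show how the
    ordering could feed a conditional (multivariable MR) test.
    """
    rank = {n: i for i, n in enumerate(topological_order(edges, nodes))}
    parents = {n: set() for n in nodes}
    for cause, effect in edges:
        parents[effect].add(cause)
    return {(x, y): sorted(parents[y] - {x}, key=rank.get) for x, y in edges}


# Toy marginal graph with abstract trait labels (not real results).
nodes = ["A", "B", "C"]
edges = [("A", "B"), ("A", "C"), ("B", "C")]
print(topological_order(edges, nodes))
print(candidate_adjustment_sets(edges, nodes))
```

In this toy graph, the marginal edge from A to C would be re-examined conditional on B; if the direct effect vanished in a multivariable analysis, the edge would be a candidate for pruning.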
Keyphrases
- machine learning
- deep learning
- contrast enhanced
- magnetic resonance
- electronic health record
- big data
- cross sectional
- magnetic resonance imaging
- computed tomography
- genome wide association
- healthcare
- high throughput
- convolutional neural network
- molecular dynamics
- data analysis
- mass spectrometry
- health information
- case report