
Support Interval for Two-Sample Summary Data-Based Mendelian Randomization.

Kai Wang
Published in: Genes (2023)
The summary-data-based Mendelian randomization (SMR) method is gaining popularity in estimating the causal effect of an exposure on an outcome. In practice, the instrument SNP is often selected from the genome-wide association study (GWAS) on the exposure but no correction is made for such selection in downstream analysis, leading to a biased estimate of the effect size and invalid inference. We address this issue by using the likelihood derived from the sampling distribution of the estimated SNP effects in the exposure GWAS and the outcome GWAS. This likelihood takes into account how the instrument SNPs are selected. Since the effective sample size is 1, the asymptotic theory does not apply. We use a support for a profile likelihood as an interval estimate of the causal effect. Simulation studies indicate that this support has robust coverage while the confidence interval implied by the SMR method has lower-than-nominal coverage. Furthermore, the variance of the two-stage least squares estimate of the causal effect is shown to be the same as the variance used for SMR for one-sample data when there is no selection.
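For orientation, the SMR confidence interval that the abstract compares against can be sketched as follows. This is a minimal illustration, not the paper's support interval: it assumes the usual ratio estimate (outcome SNP effect divided by exposure SNP effect) with a first-order delta-method variance for two independent samples, and it makes no correction for how the instrument SNP was selected. The function name `smr_wald_interval` and its argument names are hypothetical.

```python
import math


def smr_wald_interval(b_zx, se_zx, b_zy, se_zy, z_crit=1.959964):
    """Ratio estimate of the causal effect with a Wald-type interval.

    b_zx, se_zx: estimated SNP effect on the exposure and its standard error.
    b_zy, se_zy: estimated SNP effect on the outcome and its standard error.
    Assumes the two GWAS samples are independent (no covariance term) and
    applies no correction for instrument selection.
    """
    if b_zx == 0:
        raise ValueError("b_zx must be nonzero for the ratio estimate")

    # Ratio (Wald) estimate of the effect of exposure on outcome.
    b_xy = b_zy / b_zx

    # First-order delta-method variance of the ratio.
    var_b_xy = se_zy**2 / b_zx**2 + (b_zy**2) * se_zx**2 / b_zx**4
    se_b_xy = math.sqrt(var_b_xy)

    lower = b_xy - z_crit * se_b_xy
    upper = b_xy + z_crit * se_b_xy
    return b_xy, se_b_xy, (lower, upper)


if __name__ == "__main__":
    # Made-up summary statistics, for illustration only.
    estimate, se, (lo, hi) = smr_wald_interval(0.5, 0.05, 0.1, 0.02)
    print(f"estimate={estimate:.4f}, se={se:.4f}, interval=({lo:.4f}, {hi:.4f})")
```

Because this interval treats the exposure SNP effect as if the SNP had not been chosen for its strong association, it is the kind of interval the abstract reports as having lower-than-nominal coverage under selection.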
Keyphrases
  • genome wide association study
  • genome wide
  • electronic health record
  • big data
  • healthcare
  • machine learning
  • data analysis
  • deep learning
  • artificial intelligence
  • patient reported outcomes