
Causal data fusion methods using summary-level statistics for a continuous outcome.

Hongkai Li, Wang Miao, Zheng Cai, Xinhui Liu, Tao Zhang, Fuzhong Xue, Zhi Geng
Published in: Statistics in Medicine (2020)
In many empirical settings, several individual studies are available that each estimate the causal effect of a treatment or exposure variable on an outcome variable, but each study adjusts for only an incomplete set of confounders. Suppose we are interested in the causal effect of a treatment or exposure on an outcome variable, and we have access to rich datasets that contain different confounders. How to integrate summary-level statistics from multiple individual datasets to improve causal inference has become a major challenge in data fusion. In this article, we propose a novel method to identify the causal effect of a treatment or exposure on a continuous outcome. We show that the causal effect is identifiable and can be estimated by combining summary-level statistics from multiple datasets, each containing a subset of the confounders, with an external dataset that contains only complete confounder information. Simulation studies indicate that the causal effect estimate obtained by our method is unbiased, and we apply the method to a study of the effect of body mass index on fasting blood glucose.
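To make the data fusion idea concrete, here is a minimal sketch of how summary statistics might be combined. It is an illustration under strong assumptions, not the estimator from the article: the outcome model is linear, both studies sample the same population, each study reports its regression slopes together with the covariance matrix of its regressors, and the external dataset supplies only the covariance between the two confounders. The function names `fuse_linear_summaries`, `ols_summary`, and `simulate`, and all simulated coefficients, are hypothetical.

```python
import numpy as np

def fuse_linear_summaries(coef1, cov1, coef2, cov2, cov_c1c2):
    """Recover slopes of Y on (X, C1, C2) from two partially adjusted studies.

    coef1, cov1: slopes of Y on (X, C1) and the 2x2 covariance of (X, C1).
    coef2, cov2: the same for (X, C2).
    cov_c1c2:    Cov(C1, C2), taken from a confounder-only external dataset.
    Assumes a linear model and a common population (illustrative only).
    """
    # OLS slopes satisfy slopes = Cov(Z)^-1 Cov(Z, Y), so Cov(Z, Y) = Cov(Z) @ slopes.
    cy1 = cov1 @ coef1  # [Cov(X, Y), Cov(C1, Y)]
    cy2 = cov2 @ coef2  # [Cov(X, Y), Cov(C2, Y)]
    # Quantities seen by both studies are averaged.
    var_x = 0.5 * (cov1[0, 0] + cov2[0, 0])
    cov_xy = 0.5 * (cy1[0] + cy2[0])
    # Assemble the full covariance of (X, C1, C2) and solve the normal equations.
    sigma = np.array([
        [var_x,      cov1[0, 1], cov2[0, 1]],
        [cov1[0, 1], cov1[1, 1], cov_c1c2],
        [cov2[0, 1], cov_c1c2,   cov2[1, 1]],
    ])
    rhs = np.array([cov_xy, cy1[1], cy2[1]])
    return np.linalg.solve(sigma, rhs)

def ols_summary(y, z):
    """Return (slopes, regressor covariance) for a regression of y on columns of z."""
    k = z.shape[1]
    cov = np.cov(np.column_stack([z, y]), rowvar=False)
    return np.linalg.solve(cov[:k, :k], cov[:k, k]), cov[:k, :k]

def simulate(n, rng):
    """Hypothetical data: X and Y both depend on correlated confounders C1, C2."""
    c1 = rng.normal(size=n)
    c2 = 0.5 * c1 + rng.normal(size=n)
    x = 0.8 * c1 - 0.6 * c2 + rng.normal(size=n)
    y = 1.5 * x + 1.0 * c1 + 2.0 * c2 + rng.normal(size=n)  # true effect of X is 1.5
    return x, c1, c2, y

rng = np.random.default_rng(0)
n = 200_000

# Study 1 observes (X, C1, Y); study 2 observes (X, C2, Y).
x, c1, _, y = simulate(n, rng)
coef1, cov1 = ols_summary(y, np.column_stack([x, c1]))
x, _, c2, y = simulate(n, rng)
coef2, cov2 = ols_summary(y, np.column_stack([x, c2]))

# The external dataset observes the confounders only.
_, c1_ext, c2_ext, _ = simulate(n, rng)
cov_c1c2 = np.cov(c1_ext, c2_ext)[0, 1]

fused = fuse_linear_summaries(coef1, cov1, coef2, cov2, cov_c1c2)
print("study 1 slope on X:", coef1[0])
print("study 2 slope on X:", coef2[0])
print("fused slope on X:  ", fused[0])
```

In this simulated setting each partially adjusted slope is biased by the omitted confounder, while the fused slope should land close to the simulated value of 1.5. The sketch relies on linearity to convert reported slopes back into covariances; it says nothing about the identification conditions or variance estimation that the article itself develops.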
Keyphrases
  • blood glucose
  • body mass index
  • electronic health record
  • rna seq
  • big data
  • blood pressure
  • combination therapy
  • skeletal muscle
  • machine learning
  • single cell
  • deep learning
  • smoking cessation