
Borrowing from supplemental sources to estimate causal effects from a primary data source.

Jeffrey A Boatman, David M Vock, Joseph Stephen Koopmeiners
Published in: Statistics in Medicine (2021)
The increasing multiplicity of data sources offers exciting possibilities in estimating the effects of a treatment, intervention, or exposure, particularly if observational and experimental sources could be used simultaneously. Borrowing between sources can potentially result in more efficient estimators, but it must be done in a principled manner to mitigate increased bias and Type I error. Furthermore, when the effect of treatment is confounded, as in observational sources or in clinical trials with noncompliance, causal effect estimators are needed to simultaneously adjust for confounding and to estimate effects across data sources. We consider the problem of estimating causal effects from a primary source and borrowing from any number of supplemental sources. We propose using regression-based estimators that borrow based on assuming exchangeability of the regression coefficients and parameters between data sources. Borrowing is accomplished with multisource exchangeability models and Bayesian model averaging. We show via simulation that a Bayesian linear model and Bayesian additive regression trees both have desirable properties and borrow under appropriate circumstances. We apply the estimators to recently completed trials of very low nicotine content cigarettes investigating their impact on smoking behavior.
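To make the borrowing mechanism concrete, the following is a minimal sketch of a multisource exchangeability model combined with Bayesian model averaging. It is a deliberately simplified illustration, not the regression-based causal estimators proposed in the paper. It assumes each source is summarized by a sample mean with a known standard error, and that every distinct mean has a normal prior N(m0, t0^2). The function names `log_marginal` and `mem_bma`, the prior settings, and the equal prior model probabilities are all hypothetical choices made for this example.

```python
import itertools
import numpy as np


def log_marginal(ybar, se, m0, t0):
    """Log marginal density of sample means that share one mean mu ~ N(m0, t0^2).

    Assumes ybar_i | mu ~ N(mu, se_i^2) independently (illustrative assumption).
    """
    ybar = np.asarray(ybar, dtype=float)
    se = np.asarray(se, dtype=float)
    k = ybar.size
    # Integrating out mu gives a multivariate normal with a shared-variance term.
    cov = np.diag(se ** 2) + t0 ** 2 * np.ones((k, k))
    resid = ybar - m0
    _, logdet = np.linalg.slogdet(cov)
    quad = resid @ np.linalg.solve(cov, resid)
    return -0.5 * (k * np.log(2 * np.pi) + logdet + quad)


def mem_bma(y_primary, se_primary, y_supp, se_supp, m0=0.0, t0=10.0):
    """Average the primary-source posterior mean over all exchangeability models.

    Each model is a tuple of 0/1 flags, one per supplemental source; a 1 means
    that source is assumed exchangeable with (shares a mean with) the primary.
    """
    y_supp = np.asarray(y_supp, dtype=float)
    se_supp = np.asarray(se_supp, dtype=float)
    models = list(itertools.product([0, 1], repeat=len(y_supp)))
    log_w, post_means = [], []
    for model in models:
        incl = np.array(model, dtype=bool)
        # Sources pooled with the primary source under this model.
        yb = np.concatenate(([y_primary], y_supp[incl]))
        sb = np.concatenate(([se_primary], se_supp[incl]))
        lm = log_marginal(yb, sb, m0, t0)
        # Non-exchangeable sources each get their own independent mean.
        for y, s in zip(y_supp[~incl], se_supp[~incl]):
            lm += log_marginal([y], [s], m0, t0)
        log_w.append(lm)  # equal prior model probabilities (assumption)
        # Conjugate normal posterior mean of the primary-source mean.
        prec = 1.0 / t0 ** 2 + np.sum(1.0 / sb ** 2)
        post_means.append((m0 / t0 ** 2 + np.sum(yb / sb ** 2)) / prec)
    log_w = np.array(log_w)
    w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
    w /= w.sum()
    return models, w, float(w @ np.array(post_means))


# Hypothetical data: one supplemental source agrees with the primary, one does not.
models, weights, estimate = mem_bma(1.0, 0.2, [1.05, 3.0], [0.2, 0.2])
```

In this toy setting the posterior weight should concentrate on the model that pools the primary source with the concordant supplemental source only, which is the sense in which such models borrow under appropriate circumstances.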
Keyphrases
  • drinking water
  • clinical trial
  • electronic health record
  • randomized controlled trial
  • big data
  • smoking cessation
  • cross sectional
  • combination therapy
  • open label