Propensity score analysis for a semi-continuous exposure variable: a study of gestational alcohol exposure and childhood cognition.
Tugba Akkaya Hocagil, Richard J Cook, Sandra W Jacobson, Joseph L Jacobson, Louise M Ryan. Published in: Journal of the Royal Statistical Society. Series A, (Statistics in Society) (2021)
Propensity score methodology has become increasingly popular in recent years as a tool for estimating causal effects in observational studies. Much of the related research has been directed at settings with binary or discrete exposure variables, with more recent work involving continuous exposure variables. In environmental epidemiology, a substantial proportion of individuals is often completely unexposed while others may experience heavy exposure, leading to an exposure distribution with a point mass at zero and a heavy right tail. We suggest a new approach to handle this type of exposure data by constructing a propensity score based on a two-part model, and show how this model can be used to more reliably adjust for covariates of a semi-continuous exposure variable. We also consider the case when a misspecified propensity score is used in a regression adjustment and derive an explicit form of the bias. We show that the potential bias gets smaller as the estimated propensity score gets closer to the true expectation of the exposure variable given a set of observed covariates. While this result pertains to a more general setting, we use it to evaluate the potential bias in settings in which the true exposure has a semi-continuous structure. Through simulation studies, we also compare the performance of our proposed method with that of a simpler linear regression-based propensity score for a continuous exposure variable and with direct covariate adjustment. Overall, we find that using a propensity score constructed via a two-part model significantly improves the regression estimate when the exposure variable is semi-continuous in nature. Specifically, when the proportion of non-exposed subjects is high and the effects of covariates on exposure and outcome are strong, the proposed two-part propensity score method outperforms the more standard competing methods.
We illustrate our method using data from the Detroit Longitudinal Cohort Study in which the exposure variable reflects gestational alcohol exposure featuring zero values and a long tail.
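To make the idea of a two-part propensity score concrete, the following is a minimal sketch and not the authors' implementation. It assumes one particular two-part specification: a logistic model for the probability of any exposure, and a normal linear model for the log of the positive exposures. Under those assumptions, the score is the estimated expectation of the exposure given the covariate, i.e. the product of the estimated probability of exposure and the estimated log-normal mean of the positive part. The single covariate `z`, the simulated data, the coefficient values, and all function names are hypothetical and chosen only for illustration.

```python
import math
import random

random.seed(0)

def expit(t):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-t))

def fit_logistic(z, d, iters=25):
    """Newton-Raphson fit of logit P(d = 1 | z) = a + b * z (one covariate)."""
    a = b = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for zi, di in zip(z, d):
            p = expit(a + b * zi)
            w = p * (1.0 - p)
            g0 += di - p            # score for the intercept
            g1 += (di - p) * zi     # score for the slope
            h00 += w                # entries of the information matrix
            h01 += w * zi
            h11 += w * zi * zi
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

def fit_ols(z, y):
    """Simple linear regression y = c + d * z; also returns residual variance."""
    n = len(z)
    zbar, ybar = sum(z) / n, sum(y) / n
    szz = sum((zi - zbar) ** 2 for zi in z)
    szy = sum((zi - zbar) * (yi - ybar) for zi, yi in zip(z, y))
    slope = szy / szz
    intercept = ybar - slope * zbar
    rss = sum((yi - intercept - slope * zi) ** 2 for zi, yi in zip(z, y))
    return intercept, slope, rss / (n - 2)

# Simulated semi-continuous exposure (all parameter values are hypothetical):
# a point mass at zero plus a log-normal positive part, both depending on z.
n = 5000
z = [random.gauss(0.0, 1.0) for _ in range(n)]
x = []
for zi in z:
    exposed = random.random() < expit(-0.5 + 1.0 * zi)
    x.append(math.exp(0.2 + 0.5 * zi + 0.5 * random.gauss(0.0, 1.0)) if exposed else 0.0)

# Part 1: model whether any exposure occurred.
d = [1.0 if xi > 0 else 0.0 for xi in x]
a_hat, b_hat = fit_logistic(z, d)

# Part 2: model the log exposure among the exposed only.
z_pos = [zi for zi, xi in zip(z, x) if xi > 0]
logx_pos = [math.log(xi) for xi in x if xi > 0]
c_hat, d_hat, s2_hat = fit_ols(z_pos, logx_pos)

def two_part_score(zi):
    """Estimated E[X | z] = P(X > 0 | z) * E[X | X > 0, z] (log-normal mean)."""
    return expit(a_hat + b_hat * zi) * math.exp(c_hat + d_hat * zi + 0.5 * s2_hat)

scores = [two_part_score(zi) for zi in z]
```

In a regression adjustment, `scores` would then enter the outcome model as an additional covariate alongside the exposure. The log-normal form for the positive part is a modelling choice made here for simplicity; other positive-part distributions would change only the second factor of `two_part_score`.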