Bayesian causal inference for observational studies with missingness in covariates and outcomes.
Huaiyu Zang, Hang J Kim, Bin Huang, Rhonda D Szczesniak
Published in: Biometrics (2023)
Missing data are a pervasive issue in observational studies using electronic health records or patient registries. They present unique challenges for statistical inference, especially causal inference, where inappropriate handling of missing data can bias causal estimates. In addition to missingness, observational health data typically contain mixed-type variables - continuous and categorical covariates - whose joint distribution is often too complex to be modeled by simple parametric models. Missing values in covariates and outcomes make causal inference even more challenging, yet most standard causal inference approaches assume fully observed data or begin only after missing values have been imputed in a separate preprocessing stage. To address these problems, we introduce a Bayesian nonparametric causal model to estimate causal effects with missing data. The proposed approach can simultaneously impute missing values, account for multiple outcomes, and estimate causal effects under the potential outcomes framework. We provide three simulation studies to show the performance of our proposed method under complicated data settings whose features are similar to those of our case studies. For example, Simulation Study 3 considers the case in which missing values exist in both outcomes and covariates. We apply our method in two case studies to evaluate the comparative effectiveness of treatments for chronic disease management in juvenile idiopathic arthritis and cystic fibrosis.
Keyphrases
- electronic health record
- big data
- cystic fibrosis
- single cell
- mental health
- healthcare
- juvenile idiopathic arthritis
- public health
- clinical decision support
- metabolic syndrome
- adipose tissue
- systemic lupus erythematosus
- high resolution
- pseudomonas aeruginosa
- mass spectrometry
- artificial intelligence
- machine learning
- air pollution
- case report
- disease activity
- human health
- lung function