Imputation of Below Detection Limit Missing Data in Chemical Mixture Analysis with Bayesian Group Index Regression.
Matthew Carli, Mary H Ward, Catherine Metayer, David C Wheeler
Published in: International Journal of Environmental Research and Public Health (2022)
There is growing scientific interest in identifying the multitude of chemical exposures related to human diseases through mixture analysis. In this paper, we address the issue of below detection limit (BDL) missing data in mixture analysis using Bayesian group index regression by treating both regression effects and missing BDL observations as parameters in a model estimated through a Markov chain Monte Carlo algorithm that we refer to as pseudo-Gibbs imputation. We compare this with other Bayesian imputation methods found in the literature (Multiple Imputation by Chained Equations and Sequential Full Bayes imputation) as well as with a non-Bayesian single-imputation method. To evaluate our proposed method, we conduct simulation studies with varying percentages of BDL missingness and strengths of association. We apply our method to the California Childhood Leukemia Study (CCLS) to estimate concentrations of chemicals in house dust in a mixture analysis of potential environmental risk factors for childhood leukemia. Our results indicate that pseudo-Gibbs imputation has superior power for exposure effects and sensitivity for identifying individual chemicals at high percentages of BDL missing data. In the CCLS, we found a significant positive association between concentrations of polycyclic aromatic hydrocarbons (PAHs) in homes and childhood leukemia as well as significant positive associations for polychlorinated biphenyls (PCBs) and herbicides among children from the highest quartile of household income. In conclusion, pseudo-Gibbs imputation addresses a commonly encountered problem in environmental epidemiology, providing practitioners the ability to jointly estimate the effects of multiple chemical exposures with high levels of BDL missingness.
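The core idea of treating BDL observations as unknown parameters inside an MCMC sampler can be illustrated with a much simpler model than the one in the paper. The sketch below is not the authors' pseudo-Gibbs algorithm and contains no group index regression. It assumes a single chemical whose log-concentrations are normally distributed, a single known detection limit, and a noninformative prior proportional to 1/sigma^2. All function and variable names (`draw_below_limit`, `gibbs_impute`, `log_dl`) are hypothetical. Each iteration draws the censored values from a normal distribution truncated above at the detection limit, then updates the mean and variance given the completed data.

```python
import math
import random
from statistics import NormalDist


def draw_below_limit(dist, log_dl, rng):
    """Draw from `dist` truncated to (-inf, log_dl] using the inverse CDF."""
    p_max = dist.cdf(log_dl)            # probability mass below the limit
    u = rng.random() * p_max            # uniform on [0, p_max)
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # inv_cdf requires 0 < u < 1
    return min(dist.inv_cdf(u), log_dl)


def gibbs_impute(log_obs, n_bdl, log_dl, n_iter=2000, burn_in=500, seed=0):
    """Data-augmentation Gibbs sampler for a left-censored normal sample.

    log_obs : list of detected log-concentrations
    n_bdl   : number of observations below the detection limit
    log_dl  : log of the detection limit (assumed known and shared)
    Returns (posterior mean of mu, posterior mean of sigma, last imputed draws).
    """
    rng = random.Random(seed)
    log_obs = list(log_obs)
    n = len(log_obs) + n_bdl
    mu = sum(log_obs) / len(log_obs)    # crude starting values
    sigma = 1.0
    mu_draws, sigma_draws, imputed = [], [], []

    for it in range(n_iter):
        # Step 1: impute BDL values given the current mu and sigma.
        dist = NormalDist(mu, sigma)
        imputed = [draw_below_limit(dist, log_dl, rng) for _ in range(n_bdl)]
        full = log_obs + imputed

        # Step 2: sigma^2 | mu, data ~ Inverse-Gamma(n/2, S/2),
        # where S is the sum of squared deviations from mu.
        s = sum((y - mu) ** 2 for y in full)
        sigma2 = (s / 2.0) / rng.gammavariate(n / 2.0, 1.0)
        sigma = math.sqrt(sigma2)

        # Step 3: mu | sigma^2, data ~ Normal(mean(full), sigma^2 / n).
        mu = rng.gauss(sum(full) / n, math.sqrt(sigma2 / n))

        if it >= burn_in:
            mu_draws.append(mu)
            sigma_draws.append(sigma)

    mu_hat = sum(mu_draws) / len(mu_draws)
    sigma_hat = sum(sigma_draws) / len(sigma_draws)
    return mu_hat, sigma_hat, imputed
```

In a mixture analysis, the imputation step would presumably also condition on the outcome model and on the other chemicals in the mixture. This sketch omits both and only shows how imputation and parameter updates alternate within one sampler.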
Keyphrases
- human health
- acute myeloid leukemia
- electronic health record
- bone marrow
- polycyclic aromatic hydrocarbons
- primary care
- endothelial cells
- air pollution
- monte carlo
- deep learning
- mental health
- young adults
- risk factors
- childhood cancer
- data analysis
- label free
- loop mediated isothermal amplification
- induced pluripotent stem cells
- life cycle
- drinking water
- pluripotent stem cells
- case control
- long term care