Multivariate Air Pollution Prediction Modeling with Partial Missingness

R M Boaz, A B Lawson, J L Pearce
Published in: Environmetrics (2019)
Missing observations from air pollution monitoring networks have posed a longstanding problem for health investigators of air pollution. Growing interest in mixtures of air pollutants has further complicated this problem, raising new challenges that require novel methods. The objective of this study is to develop a methodology for multivariate prediction of air pollution. We focus specifically on three forms of missing data: spatial (sparse sites), outcome (pollutants not measured at some sites), and temporal (varieties of interrupted time series). To address these challenges, we develop a novel multivariate fusion framework, which leverages the observed inter-pollutant correlation structure to reduce error in the simultaneous prediction of multiple air pollutants. Our joint fusion model combines predictions from the Environmental Protection Agency's Community Multiscale Air Quality (CMAQ) model with spatio-temporal error terms. We applied our models to both simulated data and a case study in South Carolina covering 8 pollutants over a 28-day period in June 2006. We found that our model, which uses a multivariate correlated error in a Bayesian framework, showed promising predictive accuracy, particularly for gaseous pollutants.
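
The idea of borrowing strength across pollutants can be illustrated with a minimal sketch. This is not the authors' Bayesian fusion model: it assumes a single site and time point, a known inter-pollutant error covariance, and zero-mean Gaussian errors around the CMAQ predictions. The function name `conditional_fill` and all numbers are hypothetical. Under those assumptions, a pollutant that was not measured is predicted as its CMAQ value plus the conditional mean of its error given the errors observed for the other pollutants.

```python
import numpy as np

def conditional_fill(y, cmaq, cov):
    """Fill NaN entries of y with CMAQ plus the conditional Gaussian mean
    of the missing errors given the observed errors (illustrative only)."""
    y = np.asarray(y, dtype=float)
    cmaq = np.asarray(cmaq, dtype=float)
    cov = np.asarray(cov, dtype=float)
    miss = np.isnan(y)
    obs = ~miss
    filled = y.copy()
    if not miss.any():
        return filled                      # nothing to predict
    if not obs.any():
        filled[miss] = cmaq[miss]          # no observed errors to borrow from
        return filled
    resid_obs = y[obs] - cmaq[obs]         # observed error around CMAQ
    s_mo = cov[np.ix_(miss, obs)]          # cross-covariance (missing, observed)
    s_oo = cov[np.ix_(obs, obs)]           # covariance among observed pollutants
    filled[miss] = cmaq[miss] + s_mo @ np.linalg.solve(s_oo, resid_obs)
    return filled

# Hypothetical two-pollutant example: the second pollutant is unmeasured.
cov = np.array([[1.0, 0.8],
                [0.8, 1.0]])               # assumed error covariance
cmaq = np.array([30.0, 12.0])              # assumed CMAQ predictions
y = np.array([32.0, np.nan])               # observed values, NaN = missing
print(conditional_fill(y, cmaq, cov))
```

With a strong assumed correlation, the unmeasured pollutant is shifted in the same direction as the error observed for the measured one; with zero correlation the prediction falls back to the CMAQ value alone.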
Keyphrases
  • air pollution
  • data analysis
  • particulate matter
  • heavy metals
  • lung function
  • healthcare
  • mental health
  • public health
  • big data
  • risk assessment
  • cystic fibrosis
  • ionic liquid
  • climate change