Predicting Monthly Community-Level Domestic Radon Concentrations in the Greater Boston Area with an Ensemble Learning Model.

Longxiang Li, Annelise J Blomberg, Rebeca Ariel Stern, Choong-Min Kang, Stefania Papatheodorou, Yaguang Wei, Man Liu, Adjani A Peralta, Carolina L Z Vieira, Petros Koutrakis
Published in: Environmental Science & Technology (2021)
Inhaling radon and its progeny is associated with adverse health outcomes. However, previous studies of the health effects of residential radon exposure in the United States have commonly relied on a county-level, temporally invariant radon model developed from measurements collected in the mid- to late 1980s. We developed a machine learning model to predict monthly radon concentrations for each ZIP Code Tabulation Area (ZCTA) in the Greater Boston area, based on 363,783 short-term measurements collected by Spruce Environmental Technologies, Inc., during 2005–2018. The model has two stages and produces predictions for all ZCTAs and months. Stage one comprised 12 base statistical models, each of which independently predicted ZCTA-level radon concentrations from geological, architectural, socioeconomic, and meteorological factors. Stage two aggregated the predictions of these 12 base models using an ensemble learning method. In 10-fold cross-validation, the stage-two model showed good prediction accuracy, with a weighted R² of 0.63 and a root mean square error of 22.6 Bq/m³. The community-level, time-varying predictions from our model have good predictive precision and accuracy and can be used in future prospective epidemiological studies in the Greater Boston area.
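The two-stage design and its cross-validated evaluation can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the base learners (`fit_mean`, `fit_univariate`) are toy stand-ins for the 12 base models, the stage-two aggregator (`inverse_mse_weights`) is a simple inverse-error weighting swapped in because the abstract does not name the ensemble method, and the weights passed to `weighted_r2` are hypothetical per-row measurement counts, since the abstract does not say how the R² was weighted.

```python
import math
import random


def weighted_r2(y_true, y_pred, weights):
    """Weighted R^2: 1 - weighted residual SS / weighted total SS."""
    y_bar = sum(w * y for w, y in zip(weights, y_true)) / sum(weights)
    ss_res = sum(w * (y - p) ** 2 for w, y, p in zip(weights, y_true, y_pred))
    ss_tot = sum(w * (y - y_bar) ** 2 for w, y in zip(weights, y_true))
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error, in the units of y (e.g. Bq/m^3)."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_true, y_pred)) / len(y_true))


def k_fold_indices(n, k=10, seed=0):
    """Shuffle row indices and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_mean(X, y):
    """Toy base learner: always predicts the training mean."""
    mu = sum(y) / len(y)
    return lambda row: mu


def fit_univariate(j):
    """Toy base learner factory: least-squares line on feature j only."""
    def fit(X, y):
        xs = [row[j] for row in X]
        x_bar, y_bar = sum(xs) / len(xs), sum(y) / len(y)
        sxx = sum((x - x_bar) ** 2 for x in xs)
        sxy = sum((x - x_bar) * (t - y_bar) for x, t in zip(xs, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        return lambda row: y_bar + slope * (row[j] - x_bar)
    return fit


def out_of_fold_predictions(fitters, X, y, k=10, seed=0):
    """Stage one: each row is predicted by models that never saw it."""
    n = len(y)
    oof = [[0.0] * len(fitters) for _ in range(n)]
    for fold in k_fold_indices(n, k, seed):
        held = set(fold)
        train = [i for i in range(n) if i not in held]
        X_tr, y_tr = [X[i] for i in train], [y[i] for i in train]
        for m, fit in enumerate(fitters):
            model = fit(X_tr, y_tr)
            for i in fold:
                oof[i][m] = model(X[i])
    return oof


def inverse_mse_weights(oof, y):
    """Stage two (stand-in): weight each base model by 1 / out-of-fold MSE."""
    n_models = len(oof[0])
    mses = [sum((row[m] - t) ** 2 for row, t in zip(oof, y)) / len(y)
            for m in range(n_models)]
    inv = [1.0 / max(e, 1e-12) for e in mses]
    total = sum(inv)
    return [v / total for v in inv]


def stack(oof, model_weights):
    """Combine base-model predictions row by row using the stage-two weights."""
    return [sum(w * p for w, p in zip(model_weights, row)) for row in oof]


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic data: two features, only the first one drives the target.
    X = [[rng.uniform(0, 1), rng.uniform(0, 1)] for _ in range(200)]
    y = [50 + 30 * row[0] + rng.gauss(0, 1) for row in X]
    counts = [rng.randint(1, 50) for _ in X]  # hypothetical measurement counts
    fitters = [fit_mean, fit_univariate(0), fit_univariate(1)]
    oof = out_of_fold_predictions(fitters, X, y, k=10)
    w = inverse_mse_weights(oof, y)
    pred = stack(oof, w)
    print("stage-two weights:", [round(v, 3) for v in w])
    print("weighted R2:", round(weighted_r2(y, pred, counts), 3))
    print("RMSE:", round(rmse(y, pred), 3))
```

Because the stage-two weights here are estimated on the same out-of-fold predictions that are then scored, the printed metrics are somewhat optimistic; a stricter evaluation would nest the stage-two fit inside the cross-validation loop.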
Keyphrases