Login / Signup

Novel Application of Machine Learning Algorithms and Model-Agnostic Methods to Identify Factors Influencing Childhood Blood Lead Levels.

Xiaochi LiuMark Patrick TaylorC Marjorie AelionChenyin Dong
Published in: Environmental science & technology (2021)
Blood lead (Pb) poisoning remains a global concern, particularly for children in their early developmental years. Broken Hill is Australia's oldest operating silver-zinc-lead mine. In this study, we utilized recent advances in machine learning to assess multiple algorithms and identify the most optimal model for predicting childhood blood Pb levels (BLL) using Broken Hill children's (<5 years of age) data (n = 23,749) from 1991 to 2015, combined with demographic, socio-economic, and environmental influencing factors. We applied model-agnostic methods to interpret the most optimal model, investigating different environmental and human factors influencing childhood BLL. Algorithm assessment showed that stacked ensemble, a method for automatically and optimally combining multiple prediction algorithms, enhanced predictive performance by 1.1% with respect to mean absolute error (p < 0.01) and 2.6% for root-mean-squared error (p < 0.01) compared to the best performing constituent algorithm (random forest). By interpreting the model, the following information was acquired: children had higher BLL if they resided within 1.0 km to the central mine area or 1.37 km to the railroad; year of testing had the greatest interactive strength with all other factors; BLL increased faster in Aboriginal than in non-Aboriginal children at 9-10 and 12-18 months of age. This "stacked ensemble + model-agnostic interpretation" framework achieved both prediction accuracy and model interpretability, identifying previously unconnected variables associated with elevated childhood BLL, offering a marked advantage over previous works. Thus, this approach has a clear value and potential for application to other environmental health issues.
Keyphrases
  • machine learning
  • deep learning
  • young adults
  • public health
  • healthcare
  • heavy metals
  • human health
  • endothelial cells
  • gold nanoparticles
  • mental health
  • electronic health record
  • risk assessment