An illustration of model agnostic explainability methods applied to environmental data.

Christopher K. Wikle, Abhirup Datta, Bhava Vyasa Hari, Edward L. Boone, Indranil Sahoo, Indulekha Kavila, Stefano Castruccio, Susan J. Simmons, Wesley S. Burr, Won Chang
Published in: Environmetrics (2022)
Historically, two primary criticisms statisticians have had of machine learning and deep neural models are their lack of uncertainty quantification and their inability to do inference (i.e., to explain which inputs are important). Explainable AI has developed in the last few years as a sub-discipline of computer science and machine learning to mitigate these concerns (as well as concerns of fairness and transparency in deep modeling). In this article, our focus is on explaining which inputs are important in models for predicting environmental data. In particular, we focus on three general methods for explainability that are model agnostic and thus applicable across a breadth of models without internal explainability: "feature shuffling", "interpretable local surrogates", and "occlusion analysis". We describe particular implementations of each of these and illustrate their use with a variety of models, all applied to the problem of long-lead forecasting of monthly soil moisture in the North American corn belt given sea surface temperature anomalies in the Pacific Ocean.
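To make the first of these methods concrete, the following is a minimal sketch of "feature shuffling" (permutation importance): each input column is shuffled in turn and the resulting loss in predictive skill is recorded, with larger degradation suggesting a more important input. The names `model`, `X_valid`, `y_valid`, and the squared-error metric are illustrative placeholders, not the paper's implementation.

```python
import numpy as np

def permutation_importance(model, X, y, metric, n_repeats=10, seed=None):
    """Estimate feature importance by shuffling one column at a time.

    Returns the mean increase in the error metric over `n_repeats`
    shuffles for each feature; larger values suggest the model relies
    more heavily on that feature.
    """
    rng = np.random.default_rng(seed)
    baseline = metric(y, model.predict(X))          # skill with intact inputs
    increases = np.zeros((X.shape[1], n_repeats))
    for j in range(X.shape[1]):                     # loop over features
        for r in range(n_repeats):
            X_perm = X.copy()
            X_perm[:, j] = rng.permutation(X_perm[:, j])  # break feature-target link
            increases[j, r] = metric(y, model.predict(X_perm)) - baseline
    return increases.mean(axis=1)

# Hypothetical usage with any fitted model exposing .predict():
# mse = lambda y, yhat: np.mean((y - yhat) ** 2)
# scores = permutation_importance(model, X_valid, y_valid, mse)
```

Occlusion analysis follows the same model-agnostic pattern, except that blocks of inputs (e.g., spatial patches of sea surface temperature anomalies) are masked rather than individual columns being shuffled.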
Keyphrases
  • machine learning
  • big data
  • artificial intelligence
  • deep learning
  • data analysis