Ensemble-SINDy: Robust sparse model discovery in the low-data, high-noise limit, with active learning and control.

U Fasel, J Nathan Kutz, Bingni W Brunton, Steven L Brunton
Published in: Proceedings. Mathematical, physical, and engineering sciences (2022)
Sparse model identification enables the discovery of nonlinear dynamical systems purely from data; however, this approach is sensitive to noise, especially in the low-data limit. In this work, we leverage the statistical approach of bootstrap aggregating (bagging) to robustify the sparse identification of nonlinear dynamics (SINDy) algorithm. First, an ensemble of SINDy models is identified from subsets of limited and noisy data. The aggregate model statistics are then used to produce inclusion probabilities of the candidate functions, which enable uncertainty quantification and probabilistic forecasts. We apply this ensemble-SINDy (E-SINDy) algorithm to several synthetic and real-world datasets and demonstrate substantial improvements to the accuracy and robustness of model discovery from extremely noisy and limited data. For example, E-SINDy uncovers partial differential equation models from data with more than twice as much measurement noise as has been previously reported. Similarly, E-SINDy learns the Lotka-Volterra dynamics from remarkably limited data of yearly lynx and hare pelts collected from 1900 to 1920. E-SINDy is computationally efficient, with scaling similar to that of standard SINDy. Finally, we show that ensemble statistics from E-SINDy can be exploited for active learning and improved model predictive control.
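The bagging step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a sequentially thresholded least-squares fit as the sparse regression, bootstrap resampling of data rows with replacement, and a toy one-dimensional system dx/dt = -2x with a polynomial candidate library. The function names `stlsq` and `ensemble_sindy`, the threshold, the noise level, and the ensemble size are all hypothetical choices made for illustration.

```python
import numpy as np


def stlsq(theta, dxdt, threshold=0.5, n_iter=10):
    """Sequentially thresholded least squares for one state variable.

    theta: (n_samples, n_features) library of candidate functions.
    dxdt:  (n_samples,) measured or estimated time derivative.
    """
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0              # prune weak candidate functions
        big = ~small
        if not big.any():
            break
        # Refit using only the surviving candidate functions.
        xi[big] = np.linalg.lstsq(theta[:, big], dxdt, rcond=None)[0]
    return xi


def ensemble_sindy(theta, dxdt, n_models=100, threshold=0.5, seed=0):
    """Fit one sparse model per bootstrap sample and aggregate the results."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = theta.shape
    coefs = np.empty((n_models, n_features))
    for k in range(n_models):
        # Bootstrap: draw rows with replacement (assumed resampling scheme).
        idx = rng.integers(0, n_samples, size=n_samples)
        coefs[k] = stlsq(theta[idx], dxdt[idx], threshold)
    # Fraction of ensemble members in which each candidate term is nonzero.
    inclusion_prob = (coefs != 0).mean(axis=0)
    return coefs, inclusion_prob


# Toy data (assumed for illustration): dx/dt = -2 x plus measurement noise.
rng = np.random.default_rng(1)
x = rng.uniform(-2.0, 2.0, size=200)
dxdt = -2.0 * x + 0.1 * rng.standard_normal(x.size)

# Candidate library with columns [1, x, x^2, x^3].
theta = np.column_stack([np.ones_like(x), x, x**2, x**3])

coefs, inclusion_prob = ensemble_sindy(theta, dxdt)
median_coefs = np.median(coefs, axis=0)  # one possible aggregate model

print("inclusion probabilities:", inclusion_prob)
print("median coefficients:", median_coefs)
```

In this sketch the spread of each column of `coefs` across the ensemble gives a rough uncertainty estimate for the corresponding coefficient, and `inclusion_prob` indicates how consistently each candidate term is selected. Taking the median is only one way to aggregate the ensemble.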