Upgrades of Genetic Programming for Data Driven Modelling of Time Series.

A Murari, E Peluso, L Spolladore, R Rossi, M Gelfusa
Published in: Evolutionary Computation (2023)
In many engineering fields and scientific disciplines, the results of experiments take the form of time series, which can be difficult to interpret and model. Genetic programming tools are powerful at extracting knowledge from data. In this work, several upgrades and refinements are proposed and tested to improve the explorative capabilities of Symbolic Regression (SR) via Genetic Programming (GP) for the investigation of time series, with the objective of extracting mathematical models directly from the available signals. The main task is not simply prediction but the identification of interpretable equations that reflect the nature of the mechanisms generating the signals. The implemented improvements involve almost all aspects of GP, from the knowledge representation and the genetic operators to the fitness function. The unique capability of genetic programming to accommodate prior information and knowledge is also leveraged effectively. The proposed upgrades cover the most important applications of empirical modelling of time series, ranging from the identification of autoregressive systems and partial differential equations to the search for models in terms of dimensionless quantities and appropriate physical units. Systems that are particularly delicate to identify, such as those showing hysteretic behaviour or governed by delay differential equations, are also addressed. The potential of the developed tools is substantiated both by a battery of systematic numerical tests with synthetic signals and by applications to experimental data.