Active learning in Gaussian process interpolation of potential energy surfaces.

Elena Uteva, Richard J Wheatley, Richard D Wilkinson
Published in: The Journal of Chemical Physics (2018)
Three active learning schemes are used to generate training data for Gaussian process interpolation of intermolecular potential energy surfaces. These schemes aim to achieve the lowest predictive error using the fewest points and therefore act as an alternative to the status quo methods involving grid-based sampling or space-filling designs like Latin hypercubes (LHC). Results are presented for three molecular systems: CO₂-Ne, CO₂-H₂, and Ar₃. For each system, two of the active learning schemes proposed notably outperform LHC designs of comparable size, and in two of the systems, produce an error value an order of magnitude lower than the one produced by the LHC method. The procedures can be used to select a subset of points from a large pre-existing data set, to select points to generate data de novo, or to supplement an existing data set to improve accuracy.
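The abstract does not spell out the three schemes, so the following is only a minimal sketch of the general idea of selecting a subset of points from a pre-existing pool. It assumes one commonly used criterion (add the pool point with the largest Gaussian process predictive variance), which may or may not match any of the schemes in the paper. The one-dimensional `toy_potential`, the kernel hyperparameters, and all function names are hypothetical stand-ins, not taken from the source.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at x_query."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    v = np.linalg.solve(chol, k_qt.T)
    mean = k_qt @ alpha
    var = rbf_kernel(x_query, x_query).diagonal() - np.sum(v * v, axis=0)
    return mean, np.maximum(var, 0.0)

def toy_potential(r):
    """Hypothetical Lennard-Jones-like curve standing in for ab initio energies."""
    return 4.0 * (r ** -12 - r ** -6)

def select_points(x_pool, y_pool, n_init=3, n_total=15, seed=0):
    """Greedy selection: start from random points, then repeatedly add the
    pool point where the GP is least certain (assumed criterion)."""
    rng = np.random.default_rng(seed)
    chosen = [int(i) for i in rng.choice(len(x_pool), size=n_init, replace=False)]
    while len(chosen) < n_total:
        _, var = gp_posterior(x_pool[chosen], y_pool[chosen], x_pool)
        var[chosen] = -np.inf  # never pick the same point twice
        chosen.append(int(np.argmax(var)))
    return chosen

# Pool of candidate geometries (here: a single separation coordinate).
x_pool = np.linspace(0.95, 2.5, 200)
y_pool = toy_potential(x_pool)

chosen = select_points(x_pool, y_pool)
mean, _ = gp_posterior(x_pool[chosen], y_pool[chosen], x_pool)
rmse = float(np.sqrt(np.mean((mean - y_pool) ** 2)))
print(f"{len(chosen)} points selected, pool RMSE = {rmse:.3g}")
```

The same loop could in principle serve the other two use cases mentioned in the abstract: replacing the pool lookup `y_pool[chosen]` with a fresh energy calculation would generate data de novo, and seeding `chosen` with an existing data set would supplement it.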