Multifidelity Information Fusion with Machine Learning: A Case Study of Dopant Formation Energies in Hafnia.
Rohit Batra, Ghanshyam Pilania, Blas Pedro Uberuaga, Rampi Ramprasad
Published in: ACS Applied Materials & Interfaces (2019)
Cost versus accuracy trade-offs are frequently encountered in materials science and engineering, where a particular property of interest can be measured or computed at different levels of accuracy, or fidelity. Naturally, the most accurate measurement is also the most resource- and time-intensive, while the inexpensive, quicker alternatives tend to be noisy. In such situations, a number of machine learning (ML) based multifidelity information fusion (MFIF) strategies can be employed to fuse information from sources of varying fidelity and make predictions at the highest level of accuracy. In this work, we compare the traditionally employed single-fidelity (SF) approach with three MFIF strategies, namely (1) Δ-learning, (2) low-fidelity as a feature, and (3) multifidelity cokriging (CK), in terms of their relative prediction accuracies and efficiencies for accelerated property predictions and high-throughput chemical space explorations. We perform our analysis using a dopant formation energy data set for hafnia, a well-known high-k material that is being extensively studied for its promising ferroelectric, piezoelectric, and pyroelectric properties. The data set covers 42 dopants in hafnia, each studied in six different hafnia phases and computed at two levels of fidelity, and we use it to identify the merits and limitations of these ML strategies. The findings of this work indicate that the MFIF-based learning schemes outperform traditional SF ML methods such as Gaussian process regression, and that CK provides an accurate, inexpensive, and flexible alternative to the other MFIF strategies. While the results presented here are for the case study of hafnia, they are expected to be general. Therefore, materials discovery problems that involve huge chemical space explorations can be studied efficiently (or even made feasible in some situations) through a combination of a large number of low-fidelity and a few high-fidelity measurements or computations, in conjunction with the CK approach.
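As a rough illustration of how the single-fidelity baseline differs from the first two MFIF strategies named above (Δ-learning and low-fidelity as a feature), the following minimal sketch may help. Everything in it is an assumption made for illustration and is not taken from the paper: the one-dimensional toy functions `low_fidelity` and `high_fidelity`, the hand-written Gaussian process mean predictor `gp_fit_predict`, the RBF kernel, and the fixed hyperparameters. It does not use the hafnia data and does not implement cokriging.

```python
import numpy as np


def rbf_kernel(A, B, length_scale):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)


def gp_fit_predict(X_train, y_train, X_test, length_scale=0.3, jitter=1e-6):
    """Posterior mean of a GP with fixed (untuned) hyperparameters."""
    offset = y_train.mean()  # centre the targets, add the mean back at the end
    K = rbf_kernel(X_train, X_train, length_scale) + jitter * np.eye(len(X_train))
    alpha = np.linalg.solve(K, y_train - offset)
    return rbf_kernel(X_test, X_train, length_scale) @ alpha + offset


# Hypothetical toy fidelities: the HF-LF difference is smooth by construction.
def low_fidelity(x):
    return np.sin(8.0 * x)


def high_fidelity(x):
    return np.sin(8.0 * x) + 0.5 * x**2 + 0.2


x_hf = np.linspace(0.0, 1.0, 5)     # few "expensive" high-fidelity points
x_test = np.linspace(0.0, 1.0, 50)  # points where HF predictions are wanted
y_hf = high_fidelity(x_hf)
y_true = high_fidelity(x_test)

# Single fidelity (SF): learn the HF property from the descriptor alone.
pred_sf = gp_fit_predict(x_hf[:, None], y_hf, x_test[:, None])

# Delta-learning: learn the HF-LF difference, then add it to the LF value.
delta_train = y_hf - low_fidelity(x_hf)
pred_delta = low_fidelity(x_test) + gp_fit_predict(
    x_hf[:, None], delta_train, x_test[:, None])

# Low-fidelity as a feature: append the LF value to the descriptor.
X_feat_train = np.column_stack([x_hf, low_fidelity(x_hf)])
X_feat_test = np.column_stack([x_test, low_fidelity(x_test)])
pred_feat = gp_fit_predict(X_feat_train, y_hf, X_feat_test)


def rmse(pred, truth):
    return float(np.sqrt(np.mean((pred - truth) ** 2)))


for name, pred in [("SF", pred_sf), ("Delta", pred_delta), ("LF-feature", pred_feat)]:
    print(f"{name}: RMSE = {rmse(pred, y_true):.4f}")
```

In this sketch, both multifidelity variants need the low-fidelity value at every point where a prediction is requested, which is visible in how `pred_delta` and `X_feat_test` are built. The toy functions are deliberately constructed so that the difference between fidelities is smoother than the high-fidelity function itself; how the strategies rank on real data is a question for the study, not for this example.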