Entropy-based active learning of graph neural network surrogate models for materials properties.
Johannes AlloteyKeith T ButlerJeyan ThiyagalingamPublished in: The Journal of chemical physics (2021)
Graph neural networks trained on experimental or calculated data are becoming an increasingly important tool in computational materials science. Networks once trained are able to make highly accurate predictions at a fraction of the cost of experiments or first-principles calculations of comparable accuracy. However, these networks typically rely on large databases of labeled experiments to train the model. In scenarios where data are scarce or expensive to obtain, this can be prohibitive. By building a neural network that provides confidence on the predicted properties, we are able to develop an active learning scheme that can reduce the amount of labeled data required by identifying the areas of chemical space where the model is most uncertain. We present a scheme for coupling a graph neural network with a Gaussian process to featurize solid-state materials and predict properties including a measure of confidence in the prediction. We then demonstrate that this scheme can be used in an active learning context to speed up the training of the model by selecting the optimal next experiment for obtaining a data label. Our active learning scheme can double the rate at which the performance of the model on a test dataset improves with additional data compared to choosing the next sample at random. This type of uncertainty quantification and active learning has the potential to open up new areas of materials science, where data are scarce and expensive to obtain, to the transformative power of graph neural networks.