Machine-Learning Predictions of Critical Temperatures from Chemical Compositions of Superconductors.

Son Gyo Jung, Guwon Jung, Jacqueline M Cole
Published in: Journal of chemical information and modeling (2024)
In the quest for advanced superconducting materials, the accurate prediction of critical temperatures (Tc) poses a formidable challenge, largely due to the complex interdependencies between superconducting properties and the chemical and structural characteristics of a given material. To address this challenge, we have developed a machine-learning framework that aims to elucidate these complicated and hitherto poorly understood structure-property and property-property relationships. This study introduces a novel machine-learning-based workflow, termed Gradient Boosted Feature Selection (GBFS), which has been tailored to predict Tc for superconductors by employing a distributed gradient-boosting framework. This approach integrates exploratory data analyses, statistical evaluations, and multicollinearity reduction techniques to select highly relevant features from a high-dimensional feature space, derived solely from the chemical composition of materials. Our methodology was rigorously tested on a data set comprising approximately 16,400 chemical compounds with around 12,000 unique chemical compositions. The GBFS workflow enabled the development of a classification model that distinguishes compositions likely to exhibit Tc values greater than 10 K. This model achieved a weighted average F1-score of 0.912, an AUC-ROC of 0.986, and an average precision score of 0.919. Additionally, the GBFS workflow underpinned a regression model that predicted Tc values with an R² of 0.945, an MAE of 3.54 K, and an RMSE of 6.57 K on a test set obtained via random splitting. Further exploration was conducted through out-of-sample Tc predictions, particularly those exceeding the liquid nitrogen temperature, and out-of-distribution predictions for (Ca₁₋ₓLaₓ)FeAs₂ based on varying lanthanum content.
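For readers less familiar with the regression metrics quoted above, the following is a minimal sketch of how R², MAE, and RMSE are conventionally computed from observed and predicted values. The function name `regression_metrics` and the four Tc values are hypothetical and chosen only for illustration; they are not taken from the study's data set or code.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (r2, mae, rmse) for paired observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]

    # Mean absolute error: average magnitude of the residuals.
    mae = sum(abs(r) for r in residuals) / n

    # Root-mean-square error: penalises large residuals more heavily than MAE.
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)

    # Coefficient of determination: 1 - (residual variation / total variation).
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot

    return r2, mae, rmse


# Hypothetical Tc values in kelvin, for illustration only (not from the paper).
observed = [4.0, 10.0, 39.0, 92.0]
predicted = [5.0, 8.0, 40.0, 90.0]

r2, mae, rmse = regression_metrics(observed, predicted)
print(f"R^2 = {r2:.3f}, MAE = {mae:.2f} K, RMSE = {rmse:.2f} K")
```

Because RMSE squares the residuals before averaging, it is never smaller than MAE; a gap between the two, as in the reported 3.54 K and 6.57 K, generally indicates that a minority of predictions carry comparatively large errors.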
The outcome of our study underscores the significance of systematic feature analysis and selection in enhancing predictive model performance, offering various advantages over models that rely primarily on algorithmic complexity. This research not only advances the field of superconductivity but also sets a precedent for the application of machine learning in materials science.
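The abstract does not spell out how multicollinearity is reduced, so the following is only a sketch of one common approach, not the authors' implementation: walk through features in order of importance and keep a feature only if its absolute Pearson correlation with every feature already kept stays below a threshold. The feature names, their values, the ranking, and the 0.9 threshold are all assumptions made for illustration; in a GBFS-style workflow the ranking would presumably come from the gradient-boosting model's feature importances.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant feature carries no linear relationship
    return cov / math.sqrt(var_x * var_y)


def drop_collinear(features, ranking, threshold=0.9):
    """Greedily keep features, most important first, skipping any feature
    that is strongly correlated with one that has already been kept."""
    kept = []
    for name in ranking:
        if all(abs(pearson(features[name], features[k])) < threshold
               for k in kept):
            kept.append(name)
    return kept


# Hypothetical composition-derived features for four compounds (illustrative).
features = {
    "mean_atomic_mass": [10.0, 20.0, 30.0, 40.0],
    "mean_atomic_number": [5.0, 10.1, 14.9, 20.0],  # nearly collinear with mass
    "mean_electronegativity": [2.0, 1.0, 2.5, 1.2],
}

# Assumed importance ranking, most important first.
ranking = ["mean_atomic_mass", "mean_atomic_number", "mean_electronegativity"]

kept = drop_collinear(features, ranking, threshold=0.9)
print(kept)
```

Processing features in importance order means that, within each correlated group, the feature the model found most useful is the one that survives, which keeps the reduced feature set both compact and interpretable.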
Keyphrases
  • machine learning
  • big data
  • artificial intelligence
  • deep learning
  • electronic health record
  • public health
  • magnetic resonance
  • computed tomography
  • neural network
  • ionic liquid