Login / Signup

Novel Solubility Prediction Models: Molecular Fingerprints and Physicochemical Features vs Graph Convolutional Neural Networks.

Sumin LeeMyeonghun LeeKi-Won GyakSung Dug KimMi-Jeong KimKyoungmin Min
Published in: ACS omega (2022)
Predicting both accurate and reliable solubility values has long been a crucial but challenging task. In this work, surrogated model-based methods were developed to accurately predict the solubility of two molecules (solute and solvent) through machine learning and deep learning. The current study employed two methods: (1) converting molecules into molecular fingerprints and adding optimal physicochemical properties as descriptors and (2) using graph convolutional network (GCN) models to convert molecules into a graph representation and deal with prediction tasks. Then, two prediction tasks were conducted with each method: (1) the solubility value (regression) and (2) the solubility class (classification). The fingerprint-based method clearly demonstrates that high performance is possible by adding simple but significant physicochemical descriptors to molecular fingerprints, while the GCN method shows that it is possible to predict various properties of chemical compounds with relatively simplified features from the graph representation. The developed methodologies provide a comprehensive understanding of constructing a proper model for predicting solubility and can be employed to find suitable solutes and solvents.
Keyphrases
  • convolutional neural network
  • deep learning
  • machine learning
  • neural network
  • artificial intelligence
  • working memory
  • ionic liquid
  • single molecule
  • high resolution
  • big data
  • mass spectrometry
  • quality control