SMiPoly: Generation of a Synthesizable Polymer Virtual Library Using Rule-Based Polymerization Reactions.

Mitsuru Ohno, Yoshihiro Hayashi, Qi Zhang, Yu Kaneko, Ryo Yoshida
Published in: Journal of Chemical Information and Modeling (2023)
Recent advances in machine learning have led to the rapid adoption of various computational methods for de novo molecular design in polymer research, including high-throughput virtual screening and inverse molecular design. In such workflows, molecular generators play an essential role in the creation or sequential modification of candidate polymer structures. Machine learning-assisted molecular design has made great technical progress over the past few years. However, the difficulty of identifying synthetic routes to such designed polymers remains unresolved. To address this technical limitation, we present Small Molecules into Polymers (SMiPoly), a Python library for virtual polymer generation that implements 22 chemical rules for commonly applied polymerization reactions. Given a set of small organic molecules as candidate monomers, the SMiPoly generator conducts all possible polymerization reactions to generate an exhaustive list of potentially synthesizable polymers. In this study, using 1083 readily available monomers, we generated 169,347 unique polymers of seven different molecular types: polyolefin, polyester, polyether, polyamide, polyimide, polyurethane, and polyoxazolidone. By comparing the distribution of the virtually created polymers with approximately 16,000 real polymers synthesized so far, it was found that the coverage and novelty of the SMiPoly-generated polymers can reach 48% and 53%, respectively. Incorporating the SMiPoly library into a molecular design workflow will accelerate the process of de novo polymer synthesis by shortening the step of selecting synthesizable candidate polymers.
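The rule-based enumeration described in the abstract can be illustrated with a small toy sketch. This is not SMiPoly's API or its actual rule set: the monomer names, the hand-assigned functional classes, the three rules, and the function `enumerate_polymers` are all hypothetical, chosen only to show how applying every rule to every compatible monomer combination yields an exhaustive candidate list.

```python
from itertools import product

# Hypothetical toy monomer set: name -> functional class.
# Classes are assigned by hand here; a real tool would detect
# them from the molecular structure.
MONOMERS = {
    "ethylene glycol": "diol",
    "1,4-butanediol": "diol",
    "adipic acid": "diacid",
    "terephthalic acid": "diacid",
    "hexamethylenediamine": "diamine",
    "styrene": "vinyl",
}

# Hypothetical toy rules: required monomer classes -> polymer class.
RULES = {
    ("vinyl",): "polyolefin",
    ("diol", "diacid"): "polyester",
    ("diamine", "diacid"): "polyamide",
}


def enumerate_polymers(monomers, rules):
    """Apply every rule to every compatible combination of monomers."""
    # Group monomer names by functional class.
    by_class = {}
    for name, cls in monomers.items():
        by_class.setdefault(cls, []).append(name)

    results = []
    for classes, polymer_class in rules.items():
        # One pool of candidate monomers per class the rule requires.
        pools = [by_class.get(cls, []) for cls in classes]
        for combo in product(*pools):
            results.append((polymer_class, combo))
    return results


polymers = enumerate_polymers(MONOMERS, RULES)
for polymer_class, combo in polymers:
    print(f"{polymer_class}: {' + '.join(combo)}")
```

With this toy input the sketch yields one polyolefin, four polyesters, and two polyamides. The size of the result grows with the product of the monomer pools for each rule, which is consistent with how about a thousand monomers can yield a library in the hundreds of thousands.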
Keyphrases
  • machine learning
  • high throughput
  • single molecule
  • healthcare
  • artificial intelligence
  • high resolution
  • mass spectrometry
  • deep learning
  • big data
  • solid phase extraction
  • loop mediated isothermal amplification