PREFER: A New Predictive Modeling Framework for Molecular Discovery.

Jessica Lanini, Gianluca Santarossa, Finton Sirockin, Richard A Lewis, Nikolas Fechner, Hubert Misztela, Sarah Lewis, Krzysztof Maziarz, Megan Stanley, Marwin H S Segler, Nikolaus Stiefl, Nadine Schneider
Published in: Journal of Chemical Information and Modeling (2023)
Machine-learning and deep-learning models have been extensively used in cheminformatics to predict molecular properties, to reduce the need for direct measurements, and to accelerate compound prioritization. However, the variety of setups and frameworks, together with the large number of molecular representations, makes it difficult to properly evaluate, reproduce, and compare these models. Here we present a new PREdictive modeling FramEwoRk for molecular discovery (PREFER), written in Python (version 3.7.7) and based on AutoSklearn (version 0.14.7), that allows comparison between different molecular representations and common machine-learning models. We provide an overview of the design of our framework and show exemplary use cases and results of several representation-model combinations on diverse data sets, both public and in-house. Finally, we discuss the use of PREFER on small data sets. The code of the framework is freely available on GitHub.
Keyphrases
  • machine learning
  • deep learning
  • big data
  • healthcare
  • single molecule
  • working memory
  • artificial intelligence
  • electronic health record
  • emergency department
  • mental health