Toxic Colors: The Use of Deep Learning for Predicting Toxicity of Compounds Merely from Their Graphic Images.
Michael Fernandez, Fuqiang Ban, Godwin Woo, Michael Hsing, Takeshi Yamazaki, Eric LeBlanc, Paul S Rennie, William J Welch, Artem Cherkasov
Published in: Journal of Chemical Information and Modeling (2018)
The majority of computational methods for predicting toxicity of chemicals are typically based on "nonmechanistic" cheminformatics solutions, relying on an arsenal of QSAR descriptors, often vaguely associated with chemical structures, and typically employing "black-box" mathematical algorithms. Nonetheless, such machine learning models, while having lower generalization capacity and interpretability, typically achieve a very high accuracy in predicting various toxicity endpoints, as unambiguously reflected by the results of the recent Tox21 competition. In the current study, we capitalize on the power of modern AI to predict Tox21 benchmark data using merely simple 2D drawings of chemicals, without employing any chemical descriptors. In particular, we have processed rather trivial 2D sketches of molecules with a supervised 2D convolutional neural network (2DConvNet) and demonstrated that the modern image recognition technology results in prediction accuracies comparable to the state-of-the-art cheminformatics tools. Furthermore, the performance of the image-based 2DConvNet model was comparatively evaluated on an external set of compounds from the Prestwick chemical library and resulted in experimental identification of significant and previously unreported antiandrogen potentials for several well-established generic drugs.
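To make the image-based idea concrete, the following is a minimal sketch of the forward pass of a tiny 2D convolutional classifier applied to a rasterized molecule drawing. It is not the authors' 2DConvNet: the abstract does not give the architecture, so the layer choices (one convolution, ReLU, global max pooling, a logistic output), the function name `predict_toxicity`, and the toy image, kernels, and weights are all assumptions made for illustration. In practice the kernels and weights would be learned from labelled Tox21 images rather than written by hand.

```python
import math


def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (cross-correlation)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Zero out negative activations."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def global_max_pool(feature_map):
    """Reduce a feature map to its single strongest response."""
    return max(max(row) for row in feature_map)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def predict_toxicity(image, kernels, weights, bias):
    """Hypothetical helper: image -> conv features -> probability-like score."""
    features = [global_max_pool(relu(conv2d_valid(image, k))) for k in kernels]
    logit = sum(w * f for w, f in zip(weights, features)) + bias
    return sigmoid(logit)


# Toy 5x5 binary "drawing" (1 = ink, 0 = background); purely illustrative.
TOY_IMAGE = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]

# Hand-written kernels standing in for learned filters (assumption).
TOY_KERNELS = [
    [[0, 1, 0], [0, 1, 0], [0, 1, 0]],  # responds to vertical strokes
    [[0, 0, 0], [1, 1, 1], [0, 0, 0]],  # responds to horizontal strokes
]
TOY_WEIGHTS = [0.5, -0.5]  # arbitrary illustrative output weights
TOY_BIAS = 0.0

score = predict_toxicity(TOY_IMAGE, TOY_KERNELS, TOY_WEIGHTS, TOY_BIAS)
```

The point of the sketch is that the only input is a pixel grid: no QSAR descriptors are computed at any stage, which is the design choice the abstract emphasizes. A real model would stack several such convolutional layers and train them by gradient descent on one output per toxicity endpoint.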