Modeling Small-Molecule Reactivity Identifies Promiscuous Bioactive Compounds.

Matthew K MatlockTyler B Hughes Jayme L Dahlin S Joshua Swamidass

Published in: Journal of chemical information and modeling (2018)

Scientists rely on high-throughput screening tools to identify promising small-molecule compounds for the development of biochemical probes and drugs. This study focuses on the identification of promiscuous bioactive compounds, which are compounds that appear active in many high-throughput screening experiments against diverse targets but are often false-positives which may not be easily developed into successful probes. These compounds can exhibit bioactivity due to nonspecific, intractable mechanisms of action and/or by interference with specific assay technology readouts. Such "frequent hitters" are now commonly identified using substructure filters, including pan assay interference compounds (PAINS). Herein, we show that mechanistic modeling of small-molecule reactivity using deep learning can improve upon PAINS filters when modeling promiscuous bioactivity in PubChem assays. Without training on high-throughput screening data, a deep learning model of small-molecule reactivity achieves a sensitivity and specificity of 18.5% and 95.5%, respectively, in identifying promiscuous bioactive compounds. This performance is similar to PAINS filters, which achieve a sensitivity of 20.3% at the same specificity. Importantly, such reactivity modeling is complementary to PAINS filters. When PAINS filters and reactivity models are combined, the resulting model outperforms either method alone, achieving a sensitivity of 24% at the same specificity. However, as a probabilistic model, the sensitivity and specificity of the deep learning model can be tuned by adjusting the threshold. Moreover, for a subset of PAINS filters, this reactivity model can help discriminate between promiscuous and nonpromiscuous bioactive compounds even among compounds matching those filters. Critically, the reactivity model provides mechanistic hypotheses for assay interference by predicting the precise atoms involved in compound reactivity. Overall, our analysis suggests that deep learning approaches to modeling promiscuous compound bioactivity may provide a complementary approach to current methods for identifying promiscuous compounds.

Keyphrases