Efficient Hit-to-Lead Searching of Kinase Inhibitor Chemical Space via Computational Fragment Merging.

Grigorii V Andrianov, Wern Juin Gabriel Ong, Ilya Serebriiskii, John Karanicolas

Published in: Journal of chemical information and modeling (2021)

In early-stage drug discovery, the hit-to-lead optimization (or "hit expansion") stage entails starting from a newly identified active compound and improving its potency or other properties. Traditionally, this process relies on synthesizing and evaluating a series of analogues to build up structure-activity relationships. Here, we describe a computational strategy focused on kinase inhibitors, intended to expedite the process of identifying analogues with improved potency. Our protocol begins from an inhibitor of the target kinase and generalizes the synthetic route used to access it. By searching for commercially available replacements for the individual building blocks used to make the parent inhibitor, we compile an enumerated library of compounds that can be accessed using the same chemical transformations; these libraries can exceed many millions, or even billions, of compounds. Because the resulting libraries are much too large for explicit virtual screening, we instead consider alternate approaches to identify the top-scoring compounds. We find that contributions from individual substituents are well described by a pairwise additivity approximation, provided that the corresponding fragments position their shared core in precisely the same way relative to the binding site. This key insight allows us to determine which fragments are suitable for merging into single new compounds and which are not. Further, the pairwise approximation allows interaction energies to be assigned to each compound in the library without any further structure-based modeling: interaction energies can instead be reliably estimated from the energies of the component fragments, and the reduced computational requirements make it feasible to use flexible energy minimizations in which the kinase responds to each substitution.
We demonstrate this protocol using libraries built from six representative kinase inhibitors drawn from the literature, which target five different kinases: CDK9, CHK1, CDK2, EGFR T790M, and ACK1. In each example, the enumerated library includes additional analogues reported by the original study to have activity, and these analogues are successfully prioritized within the library. We envision that the insights from this work can facilitate the rapid assembly and screening of increasingly large libraries for focused hit-to-lead optimization. To enable adoption of these methods and to encourage further analyses, we disseminate the computational tools needed to deploy this protocol.
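The additivity idea described above can be illustrated with a minimal sketch. This is not the authors' released tooling. It assumes a simplified two-site case in which each fragment is the shared core plus one substituent, each substituent's contribution is taken as the fragment energy minus the core-only energy, and lower (more negative) energies are better. The function names, the energy values, and the use of a pose label to stand in for "positions the shared core in the same way" are all hypothetical.

```python
import heapq
from itertools import product


def estimate_energy(e_core, e_frag_a, e_frag_b):
    """Additive estimate for a merged compound (simplifying assumption):
    core energy plus the contribution of each substituent, where a
    contribution is the fragment energy minus the core-only energy."""
    return e_core + (e_frag_a - e_core) + (e_frag_b - e_core)


def top_merged(e_core, site_a, site_b, k=3):
    """Return the k lowest-energy merged compounds as (energy, a, b) tuples.

    site_a and site_b map a hypothetical fragment name to a tuple of
    (fragment energy, core pose label). Two fragments are merged only if
    their pose labels match, i.e. they place the shared core the same way.
    """
    candidates = (
        (estimate_energy(e_core, e_a, e_b), name_a, name_b)
        for (name_a, (e_a, pose_a)), (name_b, (e_b, pose_b))
        in product(site_a.items(), site_b.items())
        if pose_a == pose_b
    )
    return heapq.nsmallest(k, candidates)


# Illustrative numbers only, not taken from the paper.
E_CORE = -10.0
SITE_A = {"A1": (-12.0, "pose1"), "A2": (-15.0, "pose2"), "A3": (-11.0, "pose1")}
SITE_B = {"B1": (-13.0, "pose1"), "B2": (-14.0, "pose2")}

best = top_merged(E_CORE, SITE_A, SITE_B, k=2)
```

In this sketch no merged compound is ever modeled in three dimensions: only the per-fragment energies are needed, which is why the cost grows with the number of fragments rather than with the size of the enumerated library. The generator still visits every compatible pair, so a real billion-compound library would need a smarter top-k search than this brute-force loop.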
