Profiling Prediction of Kinase Inhibitors: Toward the Virtual Assay.
Benjamin Merget, Samo Turk, Sameh Eid, Friedrich Rippmann, Simone Fulle
Published in: Journal of Medicinal Chemistry (2016)
Kinome-wide screening would have the advantage of providing structure-activity relationships against hundreds of targets simultaneously. Here, we report the generation of ligand-based activity prediction models for over 280 kinases by employing Machine Learning methods on an extensive data set of proprietary bioactivity data combined with open data. High quality (AUC > 0.7) was achieved for ∼200 kinases by (1) combining open with proprietary data, (2) choosing Random Forest over alternative tested Machine Learning methods, and (3) balancing the training data sets. Tests on left-out and external data indicate a high value for virtual screening projects. Importantly, the derived models are evenly distributed across the kinome tree, allowing reliable profiling prediction for all kinase branches. The prediction quality was further improved by employing experimental bioactivity fingerprints of a small kinase subset. Overall, the generated models can support various hit identification tasks, including virtual screening, compound repurposing, and the detection of potential off-targets.
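The abstract names two ingredients that are easy to make concrete: balancing the training sets and judging each kinase model by AUC against the 0.7 cutoff. The sketch below illustrates both in plain Python. It is not the authors' pipeline. The abstract does not say how balancing was done, so random undersampling of the majority class is an assumption here, and the helper names `balance_by_undersampling`, `roc_auc`, and `is_high_quality` are hypothetical.

```python
import random


def balance_by_undersampling(records, seed=0):
    """Return a class-balanced subset of (features, label) pairs.

    Labels are 1 (active) or 0 (inactive). The larger class is randomly
    undersampled to the size of the smaller one. Undersampling is an
    assumption; the abstract only states that training sets were balanced.
    """
    rng = random.Random(seed)
    actives = [r for r in records if r[1] == 1]
    inactives = [r for r in records if r[1] == 0]
    n = min(len(actives), len(inactives))
    balanced = rng.sample(actives, n) + rng.sample(inactives, n)
    rng.shuffle(balanced)
    return balanced


def roc_auc(labels, scores):
    """ROC AUC as the fraction of (active, inactive) pairs ranked correctly.

    Ties between an active and an inactive score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one active and one inactive")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def is_high_quality(auc, threshold=0.7):
    """Apply the AUC > 0.7 quality criterion quoted in the abstract."""
    return auc > threshold


if __name__ == "__main__":
    # Toy, made-up data for one kinase: 2 actives and 8 inactives.
    toy = [((i,), 1) for i in range(2)] + [((i,), 0) for i in range(8)]
    print(len(balance_by_undersampling(toy)))  # 2 actives + 2 inactives kept

    # Made-up model scores for five held-out compounds.
    auc = roc_auc([1, 1, 0, 0, 0], [0.9, 0.4, 0.5, 0.3, 0.1])
    print(round(auc, 3), is_high_quality(auc))
```

In a real setting the scores would come from a trained classifier such as a Random Forest, evaluated on left-out compounds for each kinase separately; the pairwise AUC shown here is quadratic in the number of compounds and is meant only to make the definition explicit.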