cando.py: Open Source Software for Predictive Bioanalytics of Large Scale Drug-Protein-Disease Data.

William Mangione, Zackary Falls, Gaurav Chopra, Ram Samudrala

Published in: Journal of Chemical Information and Modeling (2020)

Traditional drug discovery methods focus on optimizing the efficacy of a drug against a single biological target of interest for a specific disease. However, evidence supports the multitarget theory, i.e., drugs work by exerting their therapeutic effects via interaction with multiple biological targets, which have multiple phenotypic effects. Analytics of drug-protein interactions on a large proteomic scale provides insight into disease systems while also allowing for prediction of putative therapeutics against specific indications. We present a Python package for analysis of drug-proteome and drug-disease relationships implementing the Computational Analysis of Novel Drug Opportunities (CANDO) platform. The CANDO package allows for rapid drug similarity assessment, most notably via an in-house interaction scoring protocol where billions of drug-protein interactions are rapidly scored and the similarity of drug-proteome interaction signatures is calculated. The package also implements a variety of benchmarking protocols for shotgun drug discovery and repurposing, i.e., to determine how every known drug is related to every other in the context of the indications/diseases for which they are approved. Drug predictions are generated through consensus scoring of the most similar compounds to drugs known to treat a particular indication. Support for comparing and ranking novel chemical entities, as well as machine learning modules for both benchmarking and putative drug candidate prediction, is also available. The CANDO Python package is available on GitHub at https://github.com/ram-compbio/CANDO, through the Conda Python package installer, and at http://compbio.org/software/.
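The two ideas at the core of the abstract, comparing drug-proteome interaction signatures and then ranking candidates by consensus among the compounds most similar to known treatments, can be illustrated with a minimal sketch. This is not the cando.py API: the function names, the choice of cosine similarity as the metric, the vote-counting rule, and the toy signatures below are all assumptions made purely for illustration.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length interaction signatures.

    Assumption: cosine similarity stands in for whatever metric the
    package actually uses to compare signatures.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, signatures, top_n):
    """Return the names of the top_n compounds most similar to `query`."""
    scored = [
        (cosine_similarity(signatures[query], sig), name)
        for name, sig in signatures.items()
        if name != query
    ]
    # Highest similarity first; ties broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:top_n]]


def consensus_predictions(known_drugs, signatures, top_n=2):
    """Rank candidates by how many known drugs list them among their top_n.

    Hypothetical consensus rule: one vote per known drug whose
    similarity list contains the candidate.
    """
    votes = Counter()
    for drug in known_drugs:
        for candidate in most_similar(drug, signatures, top_n):
            if candidate not in known_drugs:
                votes[candidate] += 1
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy signatures (made-up scores): one interaction score per protein.
signatures = {
    "drug_A": [0.9, 0.1, 0.8, 0.0],
    "drug_B": [0.8, 0.2, 0.9, 0.1],
    "drug_C": [0.0, 0.9, 0.1, 0.8],
    "drug_D": [0.85, 0.15, 0.7, 0.05],
    "drug_E": [0.1, 0.8, 0.0, 0.9],
}

# Suppose drug_A and drug_B are approved for the indication of interest.
predictions = consensus_predictions({"drug_A", "drug_B"}, signatures, top_n=2)
print(predictions)
```

In this toy data set, drug_D appears in the similarity lists of both known drugs and so ranks first as a putative candidate. A real analysis would use signatures spanning a full proteome and the package's own scoring and benchmarking routines.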
