Machine Learning-based Integration of Network Features and Chemical Structure of Compounds for SARS-CoV-2 Drug Effect Analysis.
Julian Späth, Rui-Sheng Wang, Maeve Humphrey, Jan Baumbach, Joseph Loscalzo
Published in: CPT: pharmacometrics & systems pharmacology (2023)
High drug development costs and the limited number of new drug approvals each year increase the need for innovative approaches to drug effect prediction. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the cause of coronavirus disease 2019 (COVID-19), led to a global pandemic with high morbidity and mortality. While effective preventive measures exist, there are few effective treatments for hospitalized patients with SARS-CoV-2 infection. Drug effect prediction is a promising strategy that could shorten development time and reduce costs compared to de novo drug discovery. In this work, we present a machine learning framework that integrates a variety of target network features and physicochemical properties of compounds and analyzes their influence on therapeutic effects against SARS-CoV-2 infection and on host cell cytotoxic effects. Random forest models trained on compounds with known experimental effects on SARS-CoV-2 infection, followed by feature importance analysis based on Shapley values, provided insights into the determinants of drug efficacy and cytotoxicity, which can be incorporated into novel drug discovery approaches. Given the complexity of the molecular mechanisms of drug action and the limited sample sizes, our models achieve a reasonable mean ROC-AUC of 0.73 on our unseen validation set. To our knowledge, this is the first work to incorporate a combination of network and physicochemical features of compounds into a machine learning model to predict drug effects on SARS-CoV-2 infection. Our systems pharmacology-based machine learning framework can be used to classify other existing drugs for SARS-CoV-2 infection and can easily be adapted to drug effect prediction for future viral outbreaks.
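The abstract relies on two quantities, Shapley values for feature importance and ROC-AUC for validation. The sketch below illustrates both on made-up toy data. It is not the authors' implementation: the value function `toy_value`, the three feature roles, and the labels and scores are hypothetical, and the Shapley values are computed by exact enumeration over feature subsets, which is feasible only for a handful of features (tree models are usually analyzed with approximations or tree-specific algorithms instead).

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n_features):
    """Exact Shapley values for a set function value(frozenset) -> float."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for k in range(len(others) + 1):
            # Weight of a coalition of size k that does not contain feature i.
            weight = (factorial(k) * factorial(n_features - k - 1)
                      / factorial(n_features))
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def roc_auc(labels, scores):
    """ROC-AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def toy_value(present):
    """Hypothetical model output given which features are 'known'.

    Feature 0 stands in for a network feature, feature 1 for a
    physicochemical property, feature 2 for an uninformative feature.
    """
    score = 0.5
    if 0 in present:
        score += 0.2
    if 1 in present:
        score += 0.1
    if 0 in present and 1 in present:
        score += 0.05  # interaction term, split evenly by the Shapley formula
    return score


phi = shapley_values(toy_value, 3)

# Made-up labels (1 = active compound) and predicted scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
auc = roc_auc(labels, scores)

print("Shapley values:", [round(p, 3) for p in phi])
print("ROC-AUC:", round(auc, 3))
```

In this toy setting the attributions sum to the difference between the output with all features and the output with none, which is the property that makes Shapley values attractive for attributing a prediction to its input features. The uninformative feature receives an attribution of zero.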