Enhancing hERG Risk Assessment with Interpretable Classificatory and Regression Models.
Igor H Sanches, Rodolpho Campos Braga, Vinicius M Alves, Carolina Horta Andrade. Published in: Chemical Research in Toxicology (2024)
The human Ether-à-go-go-Related Gene (hERG) encodes a transmembrane potassium channel that regulates the cardiac action potential, and its inhibition can induce a potentially deadly cardiac syndrome. In vitro tests help identify hERG blockers at early stages; however, their high cost motivates the search for cost-effective alternatives. The primary goal of this study was to enhance the Pred-hERG tool for predicting hERG blockage. To achieve this, we developed new QSAR models that incorporated additional data, updated the existing classificatory and multiclassificatory models, and introduced new regression models. We also integrated SHAP (SHapley Additive exPlanations) values to offer a visual interpretation of these models. Using the latest data from ChEMBL v30, encompassing over 14,364 compounds with hERG data, our binary and multiclassification models outperformed both the previous iteration of Pred-hERG and all publicly available models. The new version of the tool also introduces a regression model for predicting hERG activity (pIC50). The optimal model demonstrated an R² of 0.61 and an RMSE of 0.48, surpassing the only available regression model in the literature. Pred-hERG 5.0 now offers users a swift, reliable, and user-friendly platform for the early assessment of chemically induced cardiotoxicity through hERG blockage. The tool provides versatile outcomes, including (i) classificatory predictions of hERG blockage with prediction reliability, (ii) multiclassificatory predictions of hERG blockage with reliability, (iii) regression predictions with estimated pIC50 values, and (iv) probability maps illustrating the contribution of chemical fragments to each prediction. Furthermore, we implemented explainable AI (XAI) analysis to visualize SHAP values, providing insight into the contribution of each feature to binary classification predictions.
A consensus prediction, calculated from the predictions of the three developed models, is also provided to assist the user's decision-making process. Pred-hERG 5.0 has been designed to be user-friendly, making it accessible to users without computational or programming expertise. The tool is freely available at http://predherg.labmol.com.br.
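
For readers unfamiliar with the regression metrics quoted in the abstract, the following is a minimal sketch of how R² (coefficient of determination) and RMSE are conventionally computed from observed and predicted pIC50 values. It is not the authors' evaluation code: the function name `r2_and_rmse` and the example values are hypothetical and purely illustrative, and the abstract does not state which variant of R² was reported.

```python
import math


def r2_and_rmse(y_true, y_pred):
    """Return (R^2, RMSE) for paired observed and predicted values.

    Hypothetical helper for illustration; uses the standard definitions
    R^2 = 1 - SS_res / SS_tot and RMSE = sqrt(SS_res / n).
    """
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    mean_true = sum(y_true) / n
    # Residual sum of squares: squared error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared deviation of observations from their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")

    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    return r2, rmse


# Made-up pIC50 values, NOT data from the study.
observed = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.5, 7.5, 7.5]
r2, rmse = r2_and_rmse(observed, predicted)
print(f"R2 = {r2:.2f}, RMSE = {rmse:.2f}")
```

Because RMSE is expressed in the same units as the predicted quantity, the reported RMSE of 0.48 corresponds to a typical error of roughly half a log unit in pIC50.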
Keyphrases
- decision making
- left ventricular
- big data
- electronic health record
- deep learning
- systematic review
- genome wide
- gene expression
- endothelial cells
- small molecule
- dna methylation
- oxidative stress
- data analysis
- single cell
- angiotensin ii
- induced pluripotent stem cells
- human health
- case report
- weight loss
- climate change
- angiotensin converting enzyme