Enhancing hERG Risk Assessment with Interpretable Classificatory and Regression Models.
Igor H Sanches, Rodolpho Campos Braga, Vinicius M Alves, Carolina Horta Andrade
Published in: Chemical Research in Toxicology (2024)
The human Ether-à-go-go-Related Gene (hERG) encodes a transmembrane potassium channel that regulates the cardiac action potential, and its inhibition can induce a potentially deadly cardiac syndrome. In vitro tests help identify hERG blockers at early stages; however, their high cost motivates the search for alternative, cost-effective methods. The primary goal of this study was to enhance the Pred-hERG tool for predicting hERG blockage. To achieve this, we developed new QSAR models that incorporated additional data, updated the existing classificatory and multiclassificatory models, and introduced new regression models. Notably, we integrated SHAP (SHapley Additive exPlanations) values to offer a visual interpretation of these models. Utilizing the latest data from ChEMBL v30, encompassing over 14,364 compounds with hERG data, our binary and multiclassification models outperformed both the previous iteration of Pred-hERG and all publicly available models. In addition, the new version of our tool introduces a regression model for predicting hERG activity (pIC50). The optimal model demonstrated an R² of 0.61 and an RMSE of 0.48, surpassing the only available regression model in the literature. Pred-hERG 5.0 now offers users a swift, reliable, and user-friendly platform for the early assessment of chemically induced cardiotoxicity through hERG blockage. The tool provides versatile outcomes, including (i) classificatory predictions of hERG blockage with prediction reliability, (ii) multiclassificatory predictions of hERG blockage with reliability, (iii) regression predictions with estimated pIC50 values, and (iv) probability maps illustrating the contribution of chemical fragments to each prediction. Furthermore, we implemented explainable AI (XAI) analysis to visualize SHAP values, providing insights into the contribution of each feature to binary classification predictions.
A consensus prediction, calculated from the predictions of the three developed models, is also provided to assist the user's decision-making process. Pred-hERG 5.0 has been designed to be user-friendly, making it accessible to users without computational or programming expertise. The tool is freely available at http://predherg.labmol.com.br.
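
For readers less familiar with the regression metrics quoted in the abstract, the following is a minimal sketch of how R² and RMSE are conventionally computed from observed and predicted pIC50 values. The function names and the numbers are hypothetical and chosen only for illustration; they are not data from the study, and the authors' exact validation protocol (for example, which data split the reported values refer to) is not stated in the abstract.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical pIC50 values, for illustration only (not from the study).
observed = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.5, 7.5, 7.5]

print(f"RMSE = {rmse(observed, predicted):.2f}")      # RMSE = 0.50
print(f"R2   = {r_squared(observed, predicted):.2f}")  # R2   = 0.80
```

Because RMSE is expressed in the same units as the target, an RMSE of 0.48 corresponds to a typical error of about half a log unit in predicted pIC50.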
Keyphrases
- risk assessment
- systematic review
- machine learning
- decision making
- endothelial cells
- metabolic syndrome
- type diabetes
- heart failure
- molecular docking
- heavy metals
- single cell
- insulin resistance
- climate change
- data analysis
- gene expression
- high throughput
- human health
- oxidative stress
- high glucose
- dna methylation
- pluripotent stem cells
- stress induced