Predicting Drug-Induced Liver Injury with Bayesian Machine Learning.
Dominic P Williams, Stanley E Lazic, Alison J Foster, Elizaveta Semenova, Paul Morgan. Published in: Chemical Research in Toxicology (2019)
Drug-induced liver injury (DILI) can require significant risk management in drug development and on occasion can cause morbidity or mortality, leading to drug attrition. Optimizing candidates preclinically can minimize hepatotoxicity risk, but DILI is difficult to predict because it has multiple etiologies, often with multifactorial and overlapping mechanisms. In addition to epidemiological risk factors, physicochemical properties, dose, disposition, lipophilicity, and hepatic metabolic function are also relevant for DILI risk. Better human-relevant, predictive models are required to improve hepatotoxicity risk assessment in drug discovery. Our hypothesis is that integrating mechanistically relevant hepatic safety assays with Bayesian machine learning will improve hepatic safety risk prediction. We present a quantitative and mechanistic risk assessment for candidate nomination using data from in vitro assays (hepatic spheroids, BSEP, mitochondrial toxicity, and bioactivation), together with physicochemical (cLogP) and exposure (Cmax,total) variables from a chemically diverse compound set (33 no/low-, 40 medium-, and 23 high-severity DILI compounds). The Bayesian model predicts the continuous underlying DILI severity and uses a data-driven prior distribution over the parameters to prevent overfitting. The model quantifies the probability that a compound falls into the no/low-, medium-, or high-severity category, with a balanced accuracy of 63% on held-out samples, and gives a continuous prediction of DILI severity along with the uncertainty in that prediction. For a binary yes/no DILI prediction, the model has a balanced accuracy of 86%, a sensitivity of 87%, a specificity of 85%, a positive predictive value of 92%, and a negative predictive value of 78%. Combining physiologically relevant assays, improved alignment with FDA recommendations, and optimal statistical integration of assay data leads to improved DILI risk prediction.
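The binary performance figures quoted above are all derived from the four cells of a confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' code. The function name `binary_metrics` and the example counts are assumptions: the counts treat the 40 medium- and 23 high-severity compounds as DILI-positive and the 33 no/low compounds as DILI-negative, and were chosen only because they round to the reported percentages. The source does not report a confusion matrix.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fn: DILI-positive compounds predicted positive/negative.
    tn/fp: DILI-negative compounds predicted negative/positive.
    """
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    ppv = tp / (tp + fp)                  # positive predictive value
    npv = tn / (tn + fn)                  # negative predictive value
    # Balanced accuracy is the mean of sensitivity and specificity,
    # which corrects for unequal class sizes.
    balanced_accuracy = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "balanced_accuracy": balanced_accuracy,
    }


# ASSUMPTION: illustrative counts only, not taken from the source.
# 63 positives (40 medium + 23 high) and 33 negatives (no/low).
m = binary_metrics(tp=55, fn=8, tn=28, fp=5)

for name, value in m.items():
    print(f"{name}: {100 * value:.0f}%")
```

With these assumed counts, every metric rounds to the value stated in the abstract, and the reported balanced accuracy of 86% is the mean of the reported sensitivity (87%) and specificity (85%).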
Keyphrases
- drug induced
- machine learning
- risk assessment
- risk factors
- high throughput
- adverse drug
- drug discovery
- big data
- endothelial cells
- oxidative stress
- artificial intelligence
- electronic health record
- heavy metals
- human health
- cardiovascular events
- emergency department
- high resolution
- coronary artery disease
- type diabetes
- deep learning
- mass spectrometry
- data analysis