Multispecies Machine Learning Predictions of In Vitro Intrinsic Clearance with Uncertainty Quantification Analyses.

Raquel Rodríguez-Pérez, Markus Trunzer, Nadine Schneider, Bernard Faller, Grégori Gerebtzoff
Published in: Molecular pharmaceutics (2022)
In pharmaceutical research, compounds are optimized for metabolic stability to avoid overly rapid elimination of the drug. Intrinsic clearance (CLint) measured in liver microsomes or hepatocytes is an important parameter during lead optimization. In this work, machine learning models were developed to relate compound structure to microsomal metabolic stability and to predict CLint for new compounds. A multitask (MT) learning architecture was introduced to model the CLint of six species simultaneously, resulting in a multispecies machine learning model. MT graph neural network (MT-GNN) regression was identified as the top-performing method, and an ensemble of 10 MT-GNN models was evaluated prospectively. Geometric mean fold errors were consistently smaller than 2-fold. Moreover, high precision values were obtained in the prediction of "high" (>300 μL/min/mg) and "low" (<100 μL/min/mg) CLint compounds. Precision values ranged from 80 to 94% for low CLint predictions and from 75 to 97% for high CLint predictions, depending on the species. Uncertainty in experimental values and in model predictions was systematically quantified. Experimental variability (aleatoric uncertainty) of all historical Novartis in vitro clearance experiments was analyzed. Interestingly, the performance of the MT-GNN models approached the experimental variability of the assays. Moreover, uncertainty estimation in predictions (epistemic uncertainty) made it possible to distinguish predictions associated with lower error from those associated with higher error. Taken together, our manuscript combines a multispecies deep learning model and large-scale uncertainty analyses to improve CLint predictions and facilitate early informed decisions for compound prioritization.
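The evaluation quantities named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: it assumes the commonly used definition of geometric mean fold error (10 raised to the mean absolute log10 ratio of predicted to observed values), it applies the abstract's thresholds of 100 and 300 μL/min/mg to both predictions and measurements, and it uses the spread of ensemble members on a log10 scale as one possible proxy for epistemic uncertainty. All function names and all numbers below are hypothetical illustrations, not data from the paper.

```python
import math
from statistics import mean, stdev


def gmfe(observed, predicted):
    """Geometric mean fold error (assumed definition): 10 ** mean(|log10(pred / obs)|)."""
    return 10 ** mean(abs(math.log10(p / o)) for o, p in zip(observed, predicted))


def clearance_class(value, low=100.0, high=300.0):
    """Bin a CLint value (uL/min/mg) using the thresholds quoted in the abstract."""
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "medium"


def precision(observed, predicted, label):
    """Fraction of compounds predicted as `label` whose measured class is also `label`."""
    hits = [
        clearance_class(o) == label
        for o, p in zip(observed, predicted)
        if clearance_class(p) == label
    ]
    return sum(hits) / len(hits) if hits else float("nan")


def ensemble_summary(member_predictions):
    """Combine ensemble members on a log10 scale (an assumption of this sketch).

    Returns the geometric-mean prediction and the standard deviation of the
    log10 predictions, used here as a simple proxy for epistemic uncertainty.
    """
    logs = [math.log10(p) for p in member_predictions]
    return 10 ** mean(logs), stdev(logs)


# Made-up values for illustration only (uL/min/mg).
observed = [50.0, 400.0, 150.0, 20.0, 350.0, 120.0]
predicted = [100.0, 200.0, 150.0, 40.0, 700.0, 60.0]

print(f"GMFE: {gmfe(observed, predicted):.2f}")
print(f"precision (low):  {precision(observed, predicted, 'low'):.2f}")
print(f"precision (high): {precision(observed, predicted, 'high'):.2f}")

# Ten hypothetical ensemble members predicting one compound.
members = [80.0, 95.0, 110.0, 90.0, 100.0, 105.0, 85.0, 120.0, 98.0, 92.0]
prediction, spread = ensemble_summary(members)
print(f"ensemble prediction: {prediction:.1f}, log10 spread: {spread:.3f}")
```

Under these assumptions, a compound whose ensemble members disagree strongly gets a larger log10 spread, which is the kind of signal the abstract describes as separating lower-error from higher-error predictions.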
Keyphrases
  • machine learning
  • neural network
  • deep learning
  • convolutional neural network
  • emergency department
  • high throughput
  • electronic health record
  • genetic diversity