A Machine Learning Framework to Improve Rat Clearance Predictions and Inform Physiologically Based Pharmacokinetic Modeling.

Andrea Andrews-Morger, Michael Reutlinger, Neil John Parrott, Andres M Olivares-Morales

Published in: Molecular Pharmaceutics (2023)

During drug discovery and development, achieving appropriate pharmacokinetics is key to establishing the efficacy and safety of new drugs. Physiologically based pharmacokinetic (PBPK) models integrating in vitro-to-in vivo extrapolation have become an essential in silico tool to achieve this goal. In this context, the most important and probably most challenging pharmacokinetic parameter to estimate is the clearance. Recent work on high-throughput PBPK modeling during drug discovery has shown that a good estimate of the unbound intrinsic clearance (CLint,u) is the key factor for useful PBPK application. In this work, three different machine learning-based strategies were explored to predict the rat CLint,u as the input into PBPK. To this end, in vivo and in vitro data were collected for a total of 2639 proprietary compounds. The strategies were compared to the standard in vitro bottom-up approach. Using the well-stirred liver model to back-calculate in vivo CLint,u from in vivo rat clearance and then training a machine learning model on this CLint,u led to more accurate clearance predictions (absolute average fold error (AAFE) 3.1 in temporal cross-validation) than the bottom-up approach (AAFE 3.6-16, depending on the scaling method) and has the advantage that no experimental in vitro data are needed. However, building a machine learning model on the bias between the back-calculated in vivo CLint,u and the bottom-up scaled in vitro CLint,u also performed well. For example, using unbound hepatocyte scaling, adding the bias prediction improved the AAFE in the temporal cross-validation from 16 for the bottom-up approach alone to 2.9. Similarly, the log Pearson r² improved from 0.1 to 0.29. Although this strategy still requires in vitro measurement of CLint,u, using unbound scaling for the bottom-up approach circumvents the need to correct the fu,inc with fu,p data.
While the above-described ML models were built on all data points available per approach, the evaluation comparing all approaches could only be performed on a subset, because ca. 75% of the molecules had missing or unquantifiable measurements of the fraction unbound in plasma or in vitro unbound intrinsic clearance, or they dropped out due to the blood-flow limitation assumed by the well-stirred model. Advantageously, by predicting CLint,u as the input into PBPK, existing workflows can be reused and the prediction of the in vivo clearance and other PK parameters can be improved.
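
The back-calculation and the error metric mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes the textbook form of the well-stirred liver model expressed in blood (hepatic blood clearance, fraction unbound in blood, hepatic blood flow) and the common definition of AAFE as 10 raised to the mean absolute log10 fold error. The function names, argument names, and the numbers in the usage note are hypothetical and are not taken from the paper, whose exact implementation and parameter values are not given here.

```python
import math


def well_stirred_clearance(cl_int_u, fu_b, q_h):
    """Hepatic blood clearance from the well-stirred model.

    cl_int_u: unbound intrinsic clearance
    fu_b:     fraction unbound in blood (assumption: blood-based form)
    q_h:      hepatic blood flow, same units as cl_int_u
    """
    return q_h * fu_b * cl_int_u / (q_h + fu_b * cl_int_u)


def back_calculate_cl_int_u(cl_h, fu_b, q_h):
    """Invert the well-stirred model to obtain in vivo CLint,u.

    Returns None when the observed clearance reaches or exceeds hepatic
    blood flow, since the model has no finite positive solution there
    (the blood-flow limitation that removes compounds from evaluation).
    """
    if cl_h >= q_h:
        return None
    return q_h * cl_h / (fu_b * (q_h - cl_h))


def aafe(predicted, observed):
    """Absolute average fold error: 10 ** mean(|log10(pred / obs)|)."""
    log_errors = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(log_errors) / len(log_errors))
```

As a usage illustration with made-up values, `back_calculate_cl_int_u(well_stirred_clearance(100.0, 0.1, 60.0), 0.1, 60.0)` recovers the input intrinsic clearance of 100, and a set of predictions that are each off by a factor of two in either direction gives an AAFE of 2.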

Keyphrases