Building and analyzing machine learning-based warfarin dose prediction models using scikit-learn.

Published in: Translational and clinical pharmacology (2022)

For personalized drug dosing, prediction models may be utilized to overcome the inter-individual variability. Multiple linear regression has been used as a conventional method to model the relationship between patient features and optimal drug dose. However, linear regression cannot capture non-linear relationships and may be adversely affected by non-normal distribution and collinearity of data. To overcome this hurdle, machine learning models have been extensively adapted in drug dose prediction. In this tutorial, random forest and neural network models will be trained in tandem with a multiple linear regression model on the International Warfarin Pharmacogenetics Consortium dataset using the scikit-learn python library. Subsequent model analyses including performance comparison, permutation feature importance computation and partial dependence plotting will be demonstrated. The basic methods of model training and analysis discussed in this article may be implemented in drug dose-related studies.

Keyphrases