Development of GBRT Model as a Novel and Robust Mathematical Model to Predict and Optimize the Solubility of Decitabine as an Anti-Cancer Drug.
Walid Kamal Abdelbasset, Shereen H Elsayed, Sameer Alshehri, Bader I Huwaimel, Ahmed Alobaida, Amal M Alsubaiyel, Abdulsalam A Alqahtani, Mohamed A El Hamd, Kumar Venkatesan, Kareem M AboRas, Mohammad A S Abourehab
Published in: Molecules (Basel, Switzerland) (2022)
The efficient production of solid-dosage oral formulations using eco-friendly supercritical solvents is regarded as a breakthrough technology for developing cost-effective therapeutic drugs. Drug solubility is a key parameter that must be measured before the process is designed. Decitabine belongs to the antimetabolite class of chemotherapy agents and is used to treat patients with myelodysplastic syndrome (MDS). In recent years, predicting drug solubility with mathematical models based on artificial intelligence (AI) has attracted interest because of the high cost of experimental investigations. The purpose of this study is to develop machine-learning-based models that estimate the optimum solubility of the anti-cancer drug decitabine and evaluate the effects of pressure and temperature on it. To build models on the small dataset used in this research, we applied three ensemble methods: Random Forest (RFR), Extra Trees (ETR), and Gradient Boosted Regression Trees (GBRT). Different configurations were tested, and optimal hyper-parameters were found. The final models were then assessed using standard metrics. RFR, ETR, and GBRT had R² scores of 0.925, 0.999, and 0.999, respectively, and MAPE error rates of 1.423 × 10⁻¹, 7.573 × 10⁻², and 7.119 × 10⁻², respectively. Based on these results, GBRT was selected as the primary model in this paper. Using this model, the optimal values were calculated as P = 380.88 bar, T = 333.01 K, and Y = 0.001073.
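
The two metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names and the sample values are hypothetical, and it assumes that MAPE is reported as a fraction rather than a percentage, which the magnitudes quoted above suggest but the abstract does not state.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error, returned as a fraction (assumption)."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up values for illustration only; these are NOT data from the study.
y_true = [1.0, 2.0, 4.0]
y_pred = [1.1, 1.8, 4.0]

print("R2  :", r2_score(y_true, y_pred))
print("MAPE:", mape(y_true, y_pred))
```

An R² close to 1 together with a small MAPE indicates that predictions track the measured values closely, which is the basis on which the abstract compares the three models.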