Computational prediction of Lee retention indices of polycyclic aromatic hydrocarbons by using machine learning.
Linkang Sun, Min Zhang, Liangxu Xie, Xiaojun Xu, Peng Xu, Lei Xu
Published in: Chemical Biology & Drug Design (2022)
Because experimental determination is difficult, quantitative structure-property relationship (QSPR) modeling and deep learning (DL) are important tools for predicting the physicochemical properties of chemical compounds. In this paper, partial least squares (PLS), genetic function approximation (GFA), and a deep neural network (DNN) were used to predict the Lee retention index (Lee-RI) of polycyclic aromatic hydrocarbons (PAHs) on the SE-52 and DB-5 stationary phases. Four molecular descriptors, molecular weight (MW), quantitative estimate of drug-likeness (QED), atomic charge weighted negative surface area (Jurs_PNSA_3), and relative negative charge (Jurs_RNCG), were selected by a genetic algorithm to construct the regression models. For SE-52, the PLS model showed the best predictive power, followed by DNN and GFA. The relative error (RE), root mean square error (RMSE), and coefficient of determination (R²) of the best PLS regression model are 1.228%, 5.407, and 0.980, respectively. For DB-5, the DNN model showed the best predictive power, followed by GFA and PLS. The RE, RMSE, and R² of the best DNN regression model are 1.058%, 4.325, and 0.976 for DB-5-1 and 0.821%, 3.795, and 0.970 for DB-5-2, respectively. The three regression models show good predictive ability, and the results also highlight their stability and ductility.
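The three figures of merit quoted in the abstract can be illustrated with a short Python sketch. The abstract does not give the formulas, so the definitions here are assumptions: RE is taken as the mean absolute relative error in percent, RMSE as the root of the mean squared residual, and R² as one minus the ratio of residual to total sum of squares. The function name `regression_metrics` and the sample values are hypothetical and are not data from the paper.

```python
import math


def regression_metrics(observed, predicted):
    """Return (RE in %, RMSE, R^2) for paired observed/predicted values.

    Assumed definitions (not stated in the abstract):
      RE   = 100 * mean(|obs - pred| / |obs|)
      RMSE = sqrt(mean((obs - pred)^2))
      R^2  = 1 - SS_res / SS_tot
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(observed)
    residuals = [o - p for o, p in zip(observed, predicted)]

    # Mean absolute relative error, expressed as a percentage.
    re_percent = 100.0 * sum(abs(r) / abs(o) for r, o in zip(residuals, observed)) / n

    # Root mean square error, in the same units as the retention index.
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)

    # Coefficient of determination relative to the mean of the observations.
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot

    return re_percent, rmse, r2


# Hypothetical retention-index values, for illustration only.
observed = [200.0, 300.0, 400.0, 500.0]
predicted = [202.0, 297.0, 404.0, 499.0]

re_percent, rmse, r2 = regression_metrics(observed, predicted)
print(f"RE = {re_percent:.3f}%  RMSE = {rmse:.3f}  R2 = {r2:.4f}")
```

Under these assumed definitions, RE is a percentage, RMSE carries the units of the retention index, and R² is dimensionless, which matches how the SE-52 values are reported in the abstract.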
Keyphrases
- deep learning
- neural network
- machine learning
- high resolution
- magnetic resonance
- dna methylation
- magnetic resonance imaging
- genome wide
- artificial intelligence
- emergency department
- computed tomography
- copy number
- heavy metals
- mass spectrometry
- risk assessment
- climate change
- network analysis
- simultaneous determination
- diffusion weighted imaging
- electron microscopy