QSAR and Classification Study on Prediction of Acute Oral Toxicity of N-Nitroso Compounds.
Tengjiao Fan, Guohui Sun, Lijiao Zhao, Xin Cui, Rugang Zhong
Published in: International Journal of Molecular Sciences (2018)
To better understand the mechanism of the in vivo toxicity of N-nitroso compounds (NNCs), rat acute oral toxicity data (50% lethal dose, LD50) for 80 NNCs were used to establish quantitative structure-activity relationship (QSAR) and classification models. Descriptors calculated by quantum chemistry methods were combined with Dragon descriptors to describe the molecular information of all compounds. Genetic algorithm (GA) and multiple linear regression (MLR) analyses were combined to develop the QSAR models, while fingerprints and machine learning methods were used to establish the classification models. The quality and predictive performance of all models were evaluated by internal and external validation techniques. The best GA-MLR-based QSAR model, containing eight molecular descriptors, was obtained with Q²loo = 0.7533, R² = 0.8071, Q²ext = 0.7041 and R²ext = 0.7195. The QSAR results showed that the acute oral toxicity of NNCs depends mainly on three factors: the polarizability, the ionization potential (IP), and the presence/absence and frequency of the C-O bond. For the classification studies, the best model was obtained using the MACCS keys fingerprint combined with an artificial neural network (ANN) algorithm. The classification models suggested that several representative substructures, including nitrile, hetero N nonbasic, alkylchloride and amine-containing fragments, are the main contributors to the high toxicity of NNCs. Overall, the developed QSAR and classification models of the rat acute oral toxicity of NNCs showed satisfactory predictive ability. The results provide insight into the toxicity mechanism of NNCs in vivo and might be used for a preliminary assessment of the toxicity of NNCs to mammals.
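The internal validation statistics quoted in the abstract (R² and the leave-one-out Q²loo) can be illustrated with a minimal sketch. It is not the authors' code: the descriptor matrix and response below are synthetic placeholders, the helper names (`fit_mlr`, `predict_mlr`, `r_squared`, `q2_loo`) are hypothetical, and Q²loo is assumed to be defined as 1 - PRESS / SS_total with the total sum of squares taken around the mean of the full training set. Descriptor selection by the genetic algorithm is not shown.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns the coefficient vector."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Predict responses for the rows of X using fitted coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef

def r_squared(X, y):
    """Coefficient of determination of the model fitted on all compounds."""
    resid = y - predict_mlr(fit_mlr(X, y), X)
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)

def q2_loo(X, y):
    """Leave-one-out cross-validated Q^2 (assumed form: 1 - PRESS / SS_total)."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i              # drop compound i from the fit
        coef = fit_mlr(X[keep], y[keep])
        pred = predict_mlr(coef, X[i:i + 1])[0]
        press += (y[i] - pred) ** 2           # squared error on the held-out compound
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

# Synthetic stand-in data: 60 "compounds", 8 "descriptors" (not the NNC data set).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(60, 8))
y_demo = X_demo @ rng.normal(size=8) + 0.5 * rng.normal(size=60)

print("R2     =", round(r_squared(X_demo, y_demo), 4))
print("Q2_loo =", round(q2_loo(X_demo, y_demo), 4))
```

For a least-squares model Q²loo can never exceed R², because each held-out residual is at least as large in magnitude as the corresponding fitted residual. The size of the gap between the two values is commonly read as a rough indicator of overfitting.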
Keyphrases
- machine learning
- oxidative stress
- deep learning
- structure activity relationship
- molecular docking
- molecular dynamics
- liver failure
- neural network
- big data
- artificial intelligence
- respiratory failure
- drug induced
- electronic health record
- gene expression
- high resolution
- genome wide
- intensive care unit
- risk assessment
- hepatitis b virus
- data analysis