In Silico HCT116 Human Colon Cancer Cell-Based Models En Route to the Discovery of Lead-Like Anticancer Drugs.
Sara Cruz, Sofia E Gomes, Pedro M Borralho, Cecília M P Rodrigues, Susana P Gaudêncio, Florbela Pereira
Published in: Biomolecules (2018)
To discover new inhibitors against the human colon carcinoma HCT116 cell line, two quantitative structure-activity relationship (QSAR) studies using molecular and nuclear magnetic resonance (NMR) descriptors were developed by exploring machine learning techniques and using half-maximal inhibitory concentration (IC50) values. In the first approach, A, regression models were developed using a total of 7339 molecules extracted from the ChEMBL and ZINC databases and recent literature. The performance of the regression models was evaluated by internal and external validation; the best model achieved R² of 0.75 and 0.73 and root mean square error (RMSE) of 0.66 and 0.69 for the training and test sets, respectively. Given the inherently time-consuming effort of working with natural products (NPs), we conceived a new NP drug hit discovery strategy, approach B, that consists of frontloading samples with 1D NMR descriptors to predict compounds with anticancer activity prior to bioactivity screening. The NMR QSAR classification models were built using 1D NMR data (¹H and ¹³C) as descriptors, from 50 crude extracts, 55 fractions and five pure compounds obtained from actinobacteria isolated from marine sediments collected off the Madeira Archipelago. The overall predictive accuracy of the best model exceeded 63% for both training and test sets.
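The two regression metrics quoted in the abstract can be illustrated with a minimal sketch. It assumes R² is the coefficient of determination (some QSAR studies instead report the squared Pearson correlation, and the abstract does not say which was used). The function names and activity values below are hypothetical and are not taken from the study.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up activity values for illustration only; not data from the paper.
observed = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.5, 7.5, 7.5]

print(f"R2   = {r_squared(observed, predicted):.2f}")
print(f"RMSE = {rmse(observed, predicted):.2f}")
```

Computing both metrics separately on the training and test sets, as the abstract reports, shows whether a model fits unseen molecules about as well as the ones it was trained on.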
Keyphrases
- magnetic resonance
- high resolution
- structure activity relationship
- machine learning
- endothelial cells
- molecular docking
- small molecule
- solid state
- big data
- high throughput
- deep learning
- systematic review
- artificial intelligence
- induced pluripotent stem cells
- contrast enhanced
- signaling pathway
- heavy metals
- emergency department
- cell proliferation
- high intensity
- quality improvement
- electronic health record
- risk assessment
- virtual reality
- single cell
- resistance training
- data analysis
- molecular dynamics simulations