Deep Transferable Compound Representation across Domains and Tasks for Low Data Drug Discovery.
Karim Abbasi, Antti Poso, Jahanbakhsh Ghasemi, Massoud Amanlou, Ali Masoudi-Nejad
Published in: Journal of Chemical Information and Modeling (2019)
The central problem in small-molecule drug discovery is finding a candidate molecule with increased pharmacological activity, suitable ADME properties, and low toxicity. Machine learning has recently made significant contributions to drug discovery. However, many machine learning methods, such as deep learning-based approaches, require large amounts of training data to make accurate predictions on unseen data. In the lead optimization step, little biological data is available on small-molecule compounds, which makes applying machine learning methods challenging. The main goal of this study is to design a new approach for such low-data settings. To this end, knowledge from a source assay (auxiliary assay) is used to learn a better model for predicting the properties of new compounds in the target assay. Existing approaches have not considered that source and target assays are tailored to different target groups with different compound distributions. In this paper, we propose a new architecture that combines a graph convolutional network with an adversarial domain adaptation network to address this issue. To evaluate the proposed approach, we applied it to the Tox21, ToxCast, SIDER, HIV, and BACE collections. The results demonstrate the effectiveness of the proposed approach in transferring relevant knowledge from the source to the target data set.
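The two ingredients named in the abstract, a graph convolution over a molecular graph and an adversarial domain objective, can be illustrated with a minimal sketch. This is not the authors' implementation: the symmetric adjacency normalization, the sum-pooling readout, the logistic discriminator, and all names, weights, and the toy three-atom graph below are assumptions chosen only for illustration.

```python
import math

def normalized_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a dense adjacency matrix (assumed normalization)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Plain dense matrix product on lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def graph_conv(adj, feats, weight):
    """One graph convolution layer: ReLU(A_norm @ H @ W)."""
    h = matmul(matmul(normalized_adjacency(adj), feats), weight)
    return [[max(0.0, v) for v in row] for row in h]

def readout(node_feats):
    """Sum-pool per-atom features into one fixed-size compound vector."""
    return [sum(col) for col in zip(*node_feats)]

def domain_probability(embedding, disc_weights, disc_bias=0.0):
    """Hypothetical logistic discriminator: P(compound comes from the source assay)."""
    z = sum(e * w for e, w in zip(embedding, disc_weights)) + disc_bias
    return 1.0 / (1.0 + math.exp(-z))

def adversarial_losses(p_src_given_source, p_src_given_target):
    """Discriminator minimizes d_loss; in the minimax form the encoder minimizes -d_loss."""
    d_loss = -(math.log(p_src_given_source) + math.log(1.0 - p_src_given_target))
    return d_loss, -d_loss

# Toy graph: a three-atom chain with one-hot atom types [C, O] (illustrative only).
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]   # hypothetical layer weights
disc_weights = [0.5, -0.5]          # hypothetical discriminator weights

embedding = readout(graph_conv(adj, feats, weight))
p_source = domain_probability(embedding, disc_weights)
d_loss, enc_loss = adversarial_losses(p_source, 1.0 - p_source)
```

In a trained model the encoder and discriminator weights would be updated in opposition, so that compound embeddings from the source and target assays become hard to tell apart while a separate property predictor is fit on the labeled data. The sketch only evaluates the forward pass and the two opposing losses.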
Keyphrases
- drug discovery
- machine learning
- small molecule
- big data
- electronic health record
- deep learning
- high throughput
- artificial intelligence
- healthcare
- oxidative stress
- systematic review
- randomized controlled trial
- data analysis
- working memory
- human immunodeficiency virus
- mass spectrometry
- high resolution
- convolutional neural network
- hiv positive
- oxide nanoparticles