Self-Supervised Contrastive Molecular Representation Learning with a Chemical Synthesis Knowledge Graph.
Jiancong Xie, Yi Wang, Jiahua Rao, Shuangjia Zheng, Yuedong Yang
Published in: Journal of Chemical Information and Modeling (2024)
Self-supervised molecular representation learning has demonstrated great promise in bridging machine learning and chemical science to accelerate the development of new drugs. Because reaction data are limited, existing methods are mostly pretrained by augmenting the intrinsic topology of molecules without effectively incorporating prior knowledge of chemical reactions, which makes it difficult for them to generalize to reaction-related tasks. To address this issue, we propose ReaKE, a reaction knowledge embedding framework that formulates chemical reactions as a knowledge graph. Specifically, we construct a chemical synthesis knowledge graph with reactants and products as nodes and reaction rules as edges. Based on this knowledge graph, we further propose novel contrastive learning at both the molecule and reaction levels to capture reaction-related functional group information within and between molecules. Extensive experiments demonstrate the effectiveness of ReaKE compared with state-of-the-art methods on several downstream tasks, including reaction classification, product prediction, and yield prediction.
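To make the graph formulation concrete, the following is a minimal sketch, not the ReaKE implementation. It assumes each reaction can be stored as (reactants, rule label, product), turns those records into (reactant, rule, product) triples, and shows a generic InfoNCE-style contrastive loss on toy embedding vectors. The reaction records, rule labels, function names, and the loss form are all illustrative assumptions; the abstract does not specify the actual objectives used at the molecule and reaction levels.

```python
import math
from collections import defaultdict

# Hypothetical toy data: (reactant SMILES, rule label, product SMILES).
# The rule labels are made up for illustration.
REACTIONS = [
    (["CC(=O)O", "CCO"], "esterification", "CC(=O)OCC"),
    (["CC(=O)Cl", "CCN"], "amide_formation", "CC(=O)NCC"),
    (["CC(=O)O", "CO"], "esterification", "CC(=O)OC"),
]


def build_triples(reactions):
    """Return sorted unique (reactant, rule, product) triples.

    Molecules are the nodes; the reaction rule labels the edge."""
    triples = set()
    for reactants, rule, product in reactions:
        for reactant in reactants:
            triples.add((reactant, rule, product))
    return sorted(triples)


def adjacency(triples):
    """Map each reactant node to its outgoing (rule, product) edges."""
    adj = defaultdict(list)
    for head, rule, tail in triples:
        adj[head].append((rule, tail))
    return dict(adj)


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE loss: -log softmax score of the positive pair.

    This is a standard contrastive objective, used here only as a
    stand-in; it is an assumption, not the loss from the paper."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum - logits[0]


triples = build_triples(REACTIONS)
graph = adjacency(triples)
```

In a sketch like this, a reactant and the product it is linked to by an edge would be a natural positive pair, with molecules from unrelated reactions serving as negatives. How ReaKE actually selects pairs is not stated in the abstract.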
Keyphrases
- machine learning
- healthcare
- randomized controlled trial
- big data
- neural network
- systematic review
- public health
- convolutional neural network
- electron transfer
- working memory
- deep learning
- early stage
- radiation therapy
- wastewater treatment
- squamous cell carcinoma
- electronic health record
- rectal cancer
- health information
- lymph node
- data analysis
- neoadjuvant chemotherapy
- locally advanced