Learning To Predict Reaction Conditions: Relationships between Solvent, Molecular Structure, and Catalyst.
Eric Walker, Joshua A Kammeraad, Jonathan Goetz, Michael T Robo, Ambuj Tewari, Paul M Zimmerman
Published in: Journal of Chemical Information and Modeling (2019)
Reaction databases provide a great deal of useful information to assist planning of experiments but do not provide any interpretation or chemical concepts to accompany this information. In this work, reactions are labeled with experimental conditions, and network analysis shows that consistencies within clusters of data points can be leveraged to organize this information. In particular, this analysis shows how particular experimental conditions (specifically solvent) are effective in enabling specific organic reactions (Friedel-Crafts, Aldol addition, Claisen condensation, Diels-Alder, and Wittig), including variations within each reaction class. Network analysis shows data points for reactions tend to break into clusters that depend on the catalyst and chemical structure. This type of clustering, which mimics how a chemist reasons, is derived directly from the network. Therefore, the findings of this work could augment synthesis planning by providing predictions in a fashion that mimics human chemists. To numerically evaluate solvent prediction ability, three methods are compared: network analysis (through the k-nearest neighbor algorithm), a support vector machine, and a deep neural network. The most accurate method in 4 of the 5 test cases is the network analysis, with deep neural networks also showing good prediction scores. The network analysis tool was evaluated by an expert panel of chemists, who generally agreed that the algorithm produced accurate solvent choices while simultaneously being transparent in the underlying reasons for its predictions.
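The k-nearest neighbor idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the features or the similarity measure, so the sketch assumes each reaction is represented by a set of fingerprint bits and compared by Tanimoto similarity. The function names (`tanimoto`, `predict_solvent`), the bit sets, and the solvent labels are all hypothetical toy values.

```python
from collections import Counter


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def predict_solvent(query, labeled, k=3):
    """Majority vote over the k labeled reactions most similar to the query.

    Returns the predicted solvent together with the neighbors used, so the
    reasoning behind the prediction stays inspectable.
    """
    ranked = sorted(labeled, key=lambda item: tanimoto(query, item[0]), reverse=True)
    neighbors = ranked[:k]
    votes = Counter(solvent for _, solvent in neighbors)
    return votes.most_common(1)[0][0], neighbors


# Hypothetical toy data: (fingerprint bits, solvent label). Not taken from the paper.
reactions = [
    (frozenset({1, 2, 3, 4}), "solvent_A"),
    (frozenset({1, 2, 3, 5}), "solvent_A"),
    (frozenset({1, 2, 6, 7}), "solvent_B"),
    (frozenset({8, 9, 10, 11}), "solvent_C"),
    (frozenset({8, 9, 10, 12}), "solvent_C"),
]

solvent, neighbors = predict_solvent(frozenset({1, 2, 3, 9}), reactions, k=3)
print(solvent)
for bits, label in neighbors:
    print(sorted(bits), label)
```

Returning the neighbors alongside the prediction reflects the transparency the abstract emphasizes: a chemist can see which known reactions drove the suggested solvent.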
Keyphrases
- network analysis
- neural network
- big data
- deep learning
- machine learning
- artificial intelligence