Extracting the Synthetic Route of Pd-Based Catalysts in Methanol Steam Reforming from the Scientific Literature.

Shuyuan Li, Yunjiang Zhang, Zhaolin Fang, Kong Meng, Rui Tian, Hong He, Shaorui Sun
Published in: Journal of Chemical Information and Modeling (2023)
Structured material synthesis routes are crucial both for chemists performing experiments and for modern applications such as machine-learning-driven materials design. With the exponential growth of the chemical literature in recent years, manually extracting this information from published papers has become time-consuming and labor-intensive. This study develops an automated method for extracting Pd-based catalyst synthesis routes from the chemical literature. First, a paragraph classification model based on regular expressions identifies paragraphs that contain material synthesis processes, and the identified paragraphs are verified using machine learning techniques. Second, natural language processing techniques automatically parse the material synthesis routes from the identified paragraphs, generate regularized flowcharts, and output structured data. Lastly, the structured synthesis-route data are used to train machine learning models that predict the performance of the materials. The extracted material entities include the product, preparation method, precursor, support, loading, synthesis operation, and operation condition. The method avoids extensive manual data annotation and improves the efficiency of acquiring information from the scientific literature. The accuracy for all 11 material entities exceeds 80%, and the accuracy for the method, support, precursor, drying time, and reduction time exceeds 90%.
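
To make the first two steps concrete, the following is a minimal Python sketch of regex-based paragraph classification followed by extraction of operations and their conditions. It is an illustration only, not the authors' implementation: the abstract does not give the actual regular expressions, so the keyword lists, the `min_hits` threshold, the function names `is_synthesis_paragraph` and `extract_operations`, and the output field names are all assumptions.

```python
import re

# Hypothetical keyword pattern for synthesis paragraphs (assumed, not from the paper).
SYNTHESIS_PATTERN = re.compile(
    r"\b(impregnat\w*|co-?precipitat\w*|calcin\w*|dried|drying|reduced|precursor)\b",
    re.IGNORECASE,
)

# Hypothetical patterns for operations and their conditions (assumed).
OPERATION = re.compile(r"\b(dried|calcined|reduced)\b", re.IGNORECASE)
TEMPERATURE = re.compile(r"(\d+(?:\.\d+)?)\s*°\s*C")
DURATION = re.compile(r"(\d+(?:\.\d+)?)\s*h\b")


def is_synthesis_paragraph(text, min_hits=2):
    """Flag a paragraph when it contains at least `min_hits` distinct keywords."""
    hits = {m.group(0).lower() for m in SYNTHESIS_PATTERN.finditer(text)}
    return len(hits) >= min_hits


def extract_operations(text):
    """Return one record per operation, with the first temperature and
    duration found between that operation and the next one."""
    matches = list(OPERATION.finditer(text))
    records = []
    for i, m in enumerate(matches):
        # The segment runs from this operation to the next one (or end of text).
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        segment = text[m.end():end]
        temp = TEMPERATURE.search(segment)
        time = DURATION.search(segment)
        records.append({
            "operation": m.group(0).lower(),
            "temperature_C": float(temp.group(1)) if temp else None,
            "time_h": float(time.group(1)) if time else None,
        })
    return records


paragraph = (
    "The Pd/ZnO catalyst was prepared by impregnation. The solid was dried "
    "at 110 °C for 12 h, calcined at 400 °C for 3 h, and reduced in H2 at "
    "300 °C for 1 h."
)

if is_synthesis_paragraph(paragraph):
    for record in extract_operations(paragraph):
        print(record)
```

The records form an ordered list of steps, which is the kind of structured output that could feed a flowchart or a downstream model. A real pipeline would need far broader patterns (other units, ranges, atmospheres) plus the machine learning verification step the abstract mentions.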
Keyphrases