A chemical reaction entity recognition method based on a natural language data augmentation strategy.
Xiaowen Zhang, Yang Li, Chaoyi Li, Jingyuan Zhu, Zhiqiang Gan, Lei Wang, Xiaofei Sun, Hengzhi You
Published in: Chemical communications (Cambridge, England) (2024)
Impressive applications of artificial intelligence in the field of chemical reaction prediction depend heavily on abundant, reliable datasets. The automated extraction of reaction procedures to build structured chemical databases is therefore of growing importance. Here, we propose a novel model named DACRER for large-scale reaction extraction, which employs transfer learning and a data augmentation strategy. The model was evaluated on chemical datasets and shows good performance in identifying and processing chemical texts.