Login / Signup

Traditional Chinese Medicine Text Similarity Calculation Model Based on the Bidirectional Temporal Siamese Network.

Jigen LuoWangping XiongJianqiang DuYingfeng LiuJianwen LiDingxing Hu
Published in: Evidence-based complementary and alternative medicine : eCAM (2021)
The text similarity calculation plays a crucial role as the core work of artificial intelligence commercial applications such as traditional Chinese medicine (TCM) auxiliary diagnosis, intelligent question and answer, and prescription recommendation. However, TCM texts have problems such as short sentence expression, inaccurate word segmentation, strong semantic relevance, high feature dimension, and sparseness. This study comprehensively considers the temporal information of sentence context and proposes a TCM text similarity calculation model based on the bidirectional temporal Siamese network (BTSN). We used the enhanced representation through knowledge integration (ERNIE) pretrained language model to train character vectors instead of word vectors and solved the problem of inaccurate word segmentation in TCM. In the Siamese network, the traditional fully connected neural network was replaced by a deep bidirectional long short-term memory (BLSTM) to capture the contextual semantics of the current word information. The improved similarity BLSTM was used to map the sentence that is to be tested into two sets of low-dimensional numerical vectors. Then, we performed similarity calculation training. Experiments on the two datasets of financial and TCM show that the performance of the BTSN model in this study was better than that of other similarity calculation models. When the number of layers of the BLSTM reached 6 layers, the accuracy of the model was the highest. This verifies that the text similarity calculation model proposed in this study has high engineering value.
Keyphrases