Benchmarking Effectiveness and Efficiency of Deep Learning Models for Semantic Textual Similarity in the Clinical Domain: Validation Study.

Qingyu Chen, Alex Rankine, Yifan Peng, Elaheh Aghaarabi, Zhiyong Lu

Published in: JMIR Medical Informatics (2021)

Despite the excitement about further improving Pearson correlations on this data set, our results highlight that evaluating both the effectiveness and the efficiency of STS models is critical. In the future, we suggest more evaluations of the models' generalization capability, as well as user-level testing. We call for community efforts to create more biomedical and clinical STS data sets from different perspectives to reflect the multifaceted notion of sentence relatedness.
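As a rough illustration of what evaluating effectiveness and efficiency together can look like, the sketch below scores a similarity function by Pearson correlation against gold ratings and by average wall-clock time per sentence pair. It is not the authors' evaluation code: the `token_overlap_similarity` baseline, the `evaluate` helper, the 0 to 5 rating scale, and the three sentence pairs with their gold scores are all hypothetical and invented for this example.

```python
import math
import time


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def token_overlap_similarity(a, b):
    """Hypothetical stand-in model: Jaccard token overlap scaled to 0-5."""
    tokens_a = set(a.lower().split())
    tokens_b = set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 5.0
    return 5.0 * len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def evaluate(model, pairs, gold):
    """Return (Pearson r, mean seconds per pair) for a similarity function."""
    start = time.perf_counter()
    predictions = [model(a, b) for a, b in pairs]
    elapsed = time.perf_counter() - start
    return pearson(predictions, gold), elapsed / len(pairs)


# Made-up sentence pairs and gold scores, for illustration only.
pairs = [
    ("the patient denies chest pain", "patient denies any chest pain"),
    ("take one tablet daily", "follow up in two weeks"),
    ("blood pressure is stable", "blood pressure remains stable"),
]
gold = [4.5, 0.0, 4.0]

r, seconds_per_pair = evaluate(token_overlap_similarity, pairs, gold)
print(f"Pearson r = {r:.3f}, mean time per pair = {seconds_per_pair:.6f} s")
```

Reporting the two numbers side by side makes the trade-off visible: a model that gains a small amount of correlation at a large cost in inference time may not be the better choice in practice.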

Keyphrases

deep learning
randomized controlled trial
systematic review
electronic health record
healthcare
big data
mental health
artificial intelligence