Evaluating Large Language Models for Automated Reporting and Data Systems Categorization: Cross-Sectional Study.
Qingxia Wu, Huali Li, Yan Wang, Yan Bai, Ya-Ping Wu, Xuan Yu, Xiaodong Li, Pei Dong, Zhong Xue, Dinggang Shen, Meiyun Wang
Published in: JMIR Medical Informatics (2024)
When equipped with structured prompts and guideline PDFs, Claude-2 demonstrated potential in assigning RADS categories to radiology cases according to established criteria such as LI-RADS version 2018. However, the current generation of chatbots is still less accurate when categorizing cases under more recent RADS criteria.