A Context-based Chatbot Surpasses Trained Radiologists and Generic ChatGPT in Following the ACR Appropriateness Guidelines.

Alexander Rau, Stephan Rau, Daniela Zoeller, Anna Fink, Hien Tran, Caroline Wilpert, Johanna Nattenmüller, Jakob Neubauer, Fabian Bamberg, Marco Reisert, Maximilian Frederik Russe
Published in: Radiology (2023)
Background: Radiological imaging guidelines are crucial for accurate diagnosis and optimal patient care because they standardize decisions and thereby reduce inappropriate imaging studies.
Purpose: We investigated whether an interactive chatbot can support clinical decision-making by providing personalized imaging recommendations drawn from American College of Radiology (ACR) appropriateness criteria documents using semantic similarity processing.
Methods: We used 209 ACR appropriateness criteria documents as a specialized knowledge base and employed LlamaIndex, a framework for connecting large language models with external data, together with ChatGPT 3.5-Turbo to create an appropriateness criteria context-aware chatbot (accGPT). Fifty clinical case files were used to compare accGPT's performance against that of general radiologists at varying experience levels and against generic ChatGPT 3.5 and 4.0.
Results: All chatbots reached at least human-level performance. For the 50 case files, accGPT performed best in providing correct recommendations that were "usually appropriate" according to the ACR criteria, and it also provided the highest proportion of consistently correct answers compared with the generic chatbots and the radiologists. Furthermore, the chatbots provided substantial time and cost savings, with an average decision time of 5 minutes and a cost of 0.19 € for all cases, compared with 50 minutes and 29.99 € for radiologists (both p < 0.01).
Conclusion: ChatGPT-based algorithms have the potential to substantially improve decision-making for clinical imaging studies in accordance with ACR guidelines. Specifically, a context-based algorithm outperformed its generic counterpart, demonstrating the value of tailoring AI solutions to specific healthcare applications.
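The semantic-similarity retrieval mentioned in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps the LlamaIndex embedding index for a plain bag-of-words cosine similarity so that it runs without external services, and it makes no language model call. The names `KNOWLEDGE_BASE`, `retrieve`, and `build_prompt` are hypothetical, and the document texts are invented placeholders, not ACR content.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for the knowledge base: placeholder texts only,
# NOT actual ACR appropriateness criteria content.
KNOWLEDGE_BASE = {
    "headache": "Placeholder text for a guideline document about imaging in patients with headache.",
    "low_back_pain": "Placeholder text for a guideline document about imaging in patients with low back pain.",
    "chest_pain": "Placeholder text for a guideline document about imaging in patients with acute chest pain.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, documents, top_k=1):
    """Return the names of the top_k documents most similar to the query."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine_similarity(query_vec, Counter(tokenize(text))), name)
        for name, text in documents.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:top_k]]


def build_prompt(query, documents, top_k=1):
    """Assemble a prompt that places the retrieved context before the case."""
    context = "\n\n".join(documents[name] for name in retrieve(query, documents, top_k))
    return (
        f"Context:\n{context}\n\n"
        f"Case: {query}\n"
        "Recommend the most appropriate imaging study."
    )
```

Calling `build_prompt("45-year-old patient with sudden severe headache", KNOWLEDGE_BASE)` would return a prompt whose context is the placeholder headache document. In a context-based chatbot of the kind described, a prompt like this would then be sent to the language model, whereas a generic chatbot would receive the case description alone.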
Keyphrases