Triage Performance Across Large Language Models, ChatGPT, and Untrained Doctors in Emergency Medicine: Comparative Study.

Lars Masanneck, Linea Schmidt, Antonia Seifert, Tristan Kölsche, Niklas Huntemann, Robin Jansen, Mohammed Mehsin, Michael Bernhard, Sven Guenther Meuth, Lennert Böhm, Marc Pawlitzki

Published in: Journal of Medical Internet Research (2024)

While LLMs and the LLM-based product ChatGPT do not yet match professionally trained raters, the triage proficiency of their best models equals that of untrained emergency department (ED) doctors. In their current form, LLMs and ChatGPT thus did not demonstrate gold-standard performance in ED triage and, in the setting of this study, failed to significantly improve untrained doctors' triage when used as decision support. Notable performance gains in newer LLM versions over older ones hint at future improvements with further technological development and specific training.

Keyphrases

emergency department
resistance training
emergency medicine
medical students
body composition
autism spectrum disorder
physical activity
high intensity
current status