Repeatability, reproducibility, and diagnostic accuracy of a commercial large language model (ChatGPT) to perform emergency department triage using the Canadian Triage and Acuity Scale.

Jeffrey Michael Franc, Lenard Cheng, Alexander Hart, Ryan Hata, Atilla Hertelendy

Published in: CJEM (2024)

This study suggests that the current ChatGPT large language model is not sufficient for emergency physicians to use in triaging simulated patients with the Canadian Triage and Acuity Scale, owing to poor repeatability and accuracy. Medical practitioners should be aware that, while ChatGPT can be a valuable tool, it may lack consistency and may frequently provide false information.

Keyphrases

emergency department
end stage renal disease
primary care
autism spectrum disorder
healthcare
newly diagnosed
chronic kidney disease
ejection fraction
peritoneal dialysis
prognostic factors
public health
adverse drug
patient reported outcomes
health information
general practice
electronic health record
patient reported