Repeatability, reproducibility, and diagnostic accuracy of a commercial large language model (ChatGPT) to perform emergency department triage using the Canadian triage and acuity scale.
Jeffrey Michael Franc, Lenard Cheng, Alexander Hart, Ryan Hata, Atilla Hertelendy
Published in: CJEM (2024)
This study suggests that the current ChatGPT large language model is not sufficient for emergency physicians to triage simulated patients using the Canadian Triage and Acuity Scale due to poor repeatability and accuracy. Medical practitioners should be aware that while ChatGPT can be a valuable tool, it may lack consistency and may frequently provide false information.
Keyphrases
- emergency department
- end stage renal disease
- primary care
- autism spectrum disorder
- healthcare
- newly diagnosed
- chronic kidney disease
- ejection fraction
- peritoneal dialysis
- prognostic factors
- public health
- adverse drug
- patient reported outcomes
- health information
- general practice
- electronic health record
- patient reported