Assessing ChatGPT's Mastery of Bloom's Taxonomy Using Psychosomatic Medicine Exam Questions: Mixed-Methods Study.

Anne Herrmann-Werner, Teresa Festl-Wietek, Friederike Holderried, Lea Herschbach, Jan Griewatz, Ken Masters, Stephan Zipfel, Moritz Mahling

Published in: Journal of Medical Internet Research (2024)

GPT-4 demonstrated a remarkable success rate when confronted with psychosomatic medicine multiple-choice exam questions, aligning with previous findings. Our analysis of its errors through Bloom's taxonomy revealed that GPT-4 occasionally ignored specific facts (remember), provided illogical reasoning (understand), or failed to apply concepts to a new situation (apply). These errors, which were presented confidently, could be attributed to inherent model biases and the tendency to generate outputs that maximize likelihood.

Keyphrases

electronic health record
patient safety
single cell
big data
emergency department
study protocol
clinical trial
adverse drug
drug induced