Large Language Models in Medical Education: Comparing ChatGPT- to Human-Generated Exam Questions.
Matthias Carl Laupichler, Johanna Flora Rother, Ilona C Grunwald Kadow, Seifollah Ahmadi, Tobias Raupach
Published in: Academic medicine : journal of the Association of American Medical Colleges (2023)
Future research should replicate the study procedure in other contexts (e.g., other medical subjects, semesters, countries, and languages). In addition, the question of whether LLMs are suitable for generating different question types, such as key feature questions, should be investigated.