Performance of ChatGPT on the Peruvian National Licensing Medical Examination: Cross-Sectional Study.
Javier Alejandro Flores-Cohaila, Abigaíl García-Vicente, Sonia F Vizcarra-Jiménez, Janith P De la Cruz-Galán, Jesús D Gutiérrez-Arratia, Blanca Geraldine Quiroga Torres, Álvaro Taype-Rondan. Published in: JMIR Medical Education (2023)
Our study found that ChatGPT (GPT-3.5 and GPT-4) can achieve expert-level performance on the ENAM, outperforming most of our examinees. We found fair agreement between GPT-3.5 and GPT-4. Incorrect answers were associated with question difficulty, a pattern that may resemble human performance. Furthermore, when questions that initially received incorrect answers were reinput with different prompts containing additional roles and context, ChatGPT achieved improved accuracy.
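The reported "fair" agreement between the two models is the kind of result commonly quantified with Cohen's kappa; the excerpt above does not restate the exact statistic, so the following is a minimal, hypothetical sketch rather than the authors' analysis. The answer lists and the use of scikit-learn are assumptions for illustration only.

```python
# Hypothetical sketch: quantifying agreement between two models' multiple-choice
# answers with Cohen's kappa. The answer lists below are illustrative only,
# not data from the study.
from sklearn.metrics import cohen_kappa_score

# Each element is the option (A-E) a model chose for one exam question.
gpt35_answers = ["A", "C", "B", "D", "A", "E", "C", "B", "A", "D"]
gpt4_answers  = ["A", "C", "D", "D", "A", "E", "B", "B", "A", "C"]

kappa = cohen_kappa_score(gpt35_answers, gpt4_answers)
print(f"Cohen's kappa: {kappa:.2f}")
# By the Landis-Koch convention, kappa values of 0.21-0.40 are read as "fair" agreement.
```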