Improved Performance of ChatGPT-4 on the OKAP Examination: A Comparative Study with ChatGPT-3.5.

Sean Teebagy, Lauren Colwell, Emma Wood, Antonio Yaghy, Misha Faustina

Published in: Journal of Academic Ophthalmology (2017), 2023

Introduction: This study aims to evaluate the performance of ChatGPT-4, an advanced artificial intelligence (AI) language model, on the Ophthalmology Knowledge Assessment Program (OKAP) examination compared to its predecessor, ChatGPT-3.5.

Methods: Both models were tested on 180 OKAP practice questions covering various ophthalmology subject categories.

Results: ChatGPT-4 significantly outperformed ChatGPT-3.5 (81% vs. 57%; p < 0.001), indicating improvements in medical knowledge assessment.

Discussion: The superior performance of ChatGPT-4 suggests potential applicability in ophthalmologic education and clinical decision support systems. Future research should focus on refining AI models, ensuring a balanced representation of fundamental and specialized knowledge, and determining the optimal method of integrating AI into medical education and practice.
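The reported comparison (81% vs. 57% on 180 questions; p < 0.001) can be sanity-checked with a short calculation. The sketch below is illustrative only. The abstract does not name the statistical test the authors used, so a pooled two-proportion z-test is assumed here. The correct-answer counts (146 and 103) are also assumptions, reconstructed by rounding the published percentages. The function name `two_proportion_z_test` is hypothetical.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


N_QUESTIONS = 180  # stated in the abstract

# ASSUMPTION: counts reconstructed from the rounded percentages,
# not taken from the paper itself.
gpt4_correct = round(0.81 * N_QUESTIONS)
gpt35_correct = round(0.57 * N_QUESTIONS)

z, p_value = two_proportion_z_test(
    gpt4_correct, N_QUESTIONS, gpt35_correct, N_QUESTIONS
)
print(f"z = {z:.2f}, two-sided p = {p_value:.2g}")
```

Under these assumptions the p-value falls well below 0.001, in line with the reported result. Because both models answered the same 180 questions, a paired test such as McNemar's would arguably suit the design better, but it needs per-question outcomes that the abstract does not provide.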

Keyphrases