The performance of OpenAI ChatGPT-4 and Google Gemini in virology multiple-choice questions: a comparative analysis of English and Arabic responses.

Malik Sallam, Kholoud Al-Mahzoum, Rawan Ahmad Almutawaa, Jasmen Ahmad Alhashash, Retaj Abdullah Dashti, Danah Raed AlSafy, Reem Abdullah Almutairi, Muna M Barakat

Published in: BMC Research Notes (2024)

ChatGPT-4 and Gemini both performed better in English than in Arabic, and ChatGPT-4 consistently surpassed Gemini in correctness and CLEAR scores. In English, ChatGPT-4 reached 80% correctness versus 62.5% for Gemini; in Arabic, the figures were 65% versus 55%. Both AI models performed better in lower cognitive domains. Both ChatGPT-4 and Gemini showed potential in educational applications; nevertheless, their performance varied across languages, highlighting the importance of continued development to ensure the effective integration of AI in healthcare education globally.
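
The correctness figures reported above can be compared directly in percentage points. The following is a minimal sketch of that arithmetic only, not the authors' analysis; the dictionary layout and the helper names `language_gap` and `model_gap` are hypothetical and chosen for illustration.

```python
# Correctness percentages as reported in the abstract above.
# The data structure and function names are illustrative assumptions.
correctness = {
    "ChatGPT-4": {"English": 80.0, "Arabic": 65.0},
    "Gemini": {"English": 62.5, "Arabic": 55.0},
}


def language_gap(scores):
    """English-minus-Arabic difference, in percentage points, for one model."""
    return scores["English"] - scores["Arabic"]


def model_gap(table, language, first="ChatGPT-4", second="Gemini"):
    """Difference between two models, in percentage points, for one language."""
    return table[first][language] - table[second][language]


# Drop in correctness when moving from English to Arabic, per model.
gaps = {model: language_gap(scores) for model, scores in correctness.items()}

# Lead of ChatGPT-4 over Gemini, per language.
leads = {lang: model_gap(correctness, lang) for lang in ("English", "Arabic")}

print(gaps)
print(leads)
```

Under these figures, the English-to-Arabic drop is 15 percentage points for ChatGPT-4 and 7.5 for Gemini, and ChatGPT-4's lead over Gemini is 17.5 points in English and 10 points in Arabic.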

Keyphrases

healthcare
psychometric properties
artificial intelligence
machine learning
deep learning
social media