How does artificial intelligence master urological board examinations? A comparative analysis of different Large Language Models' accuracy and reliability in the 2022 In-Service Assessment of the European Board of Urology.
Lisa Kollitsch, Klaus Eredics, Martin Marszalek, Michael Rauchenwald, Sabine D Brookman-May, Maximilian Burger, Katharina Körner-Riffard, Matthias May
Published in: World journal of urology (2024)
The performance of the tested LLMs in answering urological specialist questions requires further refinement. Moreover, their lack of response reliability adds to the existing challenges that limit their current utility for educational purposes.