Influence of a Large Language Model on Diagnostic Reasoning: A Randomized Clinical Vignette Study.
Ethan Goh, Robert J Gallo, Jason Hom, Eric Strong, Yingjie Weng, Hannah Kerman, Joséphine A Cool, Zahir Kanjee, Andrew Stephen Parsons, Neera K Ahuja, Eric Horvitz, Daniel Yang, Arnold Milstein, Andrew P J Olson, Adam M Rodman, Jonathan H Chen
Published in: medRxiv: the preprint server for health sciences (2024)
In a clinical vignette-based study, the availability of GPT-4 to physicians as a diagnostic aid did not significantly improve clinical reasoning compared to conventional resources, although it may improve components of clinical reasoning such as efficiency. GPT-4 alone demonstrated higher performance than both physician groups, suggesting opportunities for further improvement in physician-AI collaboration in clinical practice.