Influence of a Large Language Model on Diagnostic Reasoning: A Randomized Clinical Vignette Study.

Ethan Goh, Robert J Gallo, Jason Hom, Eric Strong, Yingjie Weng, Hannah Kerman, Joséphine A Cool, Zahir Kanjee, Andrew Stephen Parsons, Neera K Ahuja, Eric Horvitz, Daniel Yang, Arnold Milstein, Andrew P J Olson, Adam M Rodman, Jonathan H Chen

Published in: medRxiv: the preprint server for health sciences (2024)

In a clinical vignette-based study, the availability of GPT-4 to physicians as a diagnostic aid did not significantly improve clinical reasoning compared to conventional resources, although it may improve components of clinical reasoning such as efficiency. GPT-4 alone demonstrated higher performance than both physician groups, suggesting opportunities for further improvement in physician-AI collaboration in clinical practice.

Keyphrases

primary care
emergency department
clinical practice
autism spectrum disorder
machine learning
deep learning