On the limitations of large language models in clinical diagnosis.

Justin T Reese, Daniel Danis, J Harry Caufield, Elena Casiraghi, Giorgio Valentini, Christopher John Mungall, Peter Nick Robinson

Published in: medRxiv : the preprint server for health sciences (2023)

We consider the feature-based queries to be a more appropriate test of the performance of GPT-4 in diagnostic tasks, since the narrative approach is unlikely to be usable in actual clinical practice. Future research and algorithmic development are needed to determine the optimal approach to leveraging LLMs for clinical diagnosis.

Keyphrases

clinical practice
machine learning
autism spectrum disorder
working memory
deep learning