Diagnostic accuracy of GPT-4 on common clinical scenarios and challenging cases.
Geoffrey W Rutledge
Published in: Learning health systems (2024)
GPT-4 performs at a level at least as good as, if not better than, that of experienced physicians on highly challenging cases in internal medicine. The extraordinary performance of GPT-4 on diagnosing common clinical scenarios could be explained in part by the fact that these cases were previously published and may have been included in the training dataset for this LLM.