Large language models in medical ethics: useful but not expert.

Published in: Journal of medical ethics (2024)

Large language models (LLMs) have now entered the realm of medical ethics. In a recent study, Balas et al examined the performance of GPT-4, a commercially available LLM, assessing its performance in generating responses to diverse medical ethics cases. Their findings reveal that GPT-4 demonstrates an ability to identify and articulate complex medical ethical issues, although its proficiency in encoding the depth of real-world ethical dilemmas remains an avenue for improvement. Investigating the integration of LLMs into medical ethics decision-making appears to be an interesting avenue of research. However, despite the promising trajectory of LLM technology in medicine, it is crucial to exercise caution and refrain from attributing their expertise to medical ethics. Our thesis follows an examination of the nature of expertise and the epistemic limitations that affect LLM technology. As a result, we propose two more fitting applications of LLMs in medical ethics: first, as tools for mining electronic health records or scientific literature, thereby supplementing evidence for resolving medical ethics cases, and second, as educational platforms to foster ethical reflection and critical thinking skills among students and residents. The integration of LLMs in medical ethics, while promising, requires careful consideration of their epistemic limitations. Consequently, a well-considered definition of their role in ethically sensitive decision-making is crucial.

Keyphrases