Optimizing large language models in digestive disease: strategies and challenges to improve clinical outcomes.
Mauro Giuffrè, Simone Kresevic, Nicola Pugliese, Kisung You, Dennis L Shung
Published in: Liver international : official journal of the International Association for the Study of the Liver (2024)
Large Language Models (LLMs) are transformer-based neural networks with billions of parameters trained on very large text corpora from diverse sources. LLMs have the potential to improve healthcare because they can parse complex concepts and generate context-based responses. Interest in LLMs has extended to digestive disease research, where studies have mainly investigated the accuracy of foundational LLMs. Reported accuracy ranges from 25% to 90% and is influenced by the lack of standardized rules for reporting methodologies and results in LLM-oriented research. In addition, there is no universally accepted definition of accuracy: interpretations vary from binary to scalar and are often tied to grader expertise without reference to clinical guidelines. We address strategies and challenges to increase accuracy. In particular, LLMs can be infused with domain knowledge using Retrieval Augmented Generation (RAG) or Supervised Fine-Tuning (SFT) with reinforcement learning from human feedback (RLHF). RAG is constrained by the limits of the model's context window and by how accurately information is retrieved from the provided context. SFT, a deeper adaptation method, is computationally demanding and requires specialized knowledge. LLMs may improve the quality of patient care across the field of digestive diseases, where physicians are often engaged in screening, treatment and surveillance for a broad range of pathologies, and where in-context learning or SFT with RLHF could improve clinical decision-making and patient outcomes. However, despite this potential, the safe deployment of LLMs in healthcare still needs to overcome hurdles in accuracy, suggesting a need for strategies that integrate human feedback with advanced model training.
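The two RAG constraints named in the abstract (a limited context window and the need for accurate retrieval) can be illustrated with a minimal Python sketch. It is not the authors' method: the bag-of-words cosine similarity stands in for a real embedding model, word counts stand in for model tokens, and the names `retrieve`, `build_prompt` and `max_context_tokens` are hypothetical, introduced only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Crude word-level tokenizer; an assumption standing in for a model tokenizer.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, passages, max_context_tokens):
    # Rank passages by similarity to the query, drop those with no overlap,
    # then keep the best-ranked passages that still fit the token budget.
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(p))), p) for p in passages]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    selected, used = [], 0
    for _, passage in ranked:
        n = len(tokenize(passage))
        if used + n > max_context_tokens:
            continue  # passage would overflow the (hypothetical) context budget
        selected.append(passage)
        used += n
    return selected


def build_prompt(query, passages, max_context_tokens=40):
    # Assemble the retrieved passages and the question into a single prompt string.
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, max_context_tokens))
    return (
        f"Context:\n{context}\n\n"
        f"Question: {query}\n"
        "Answer using only the context above."
    )
```

With a small budget, relevant passages can be left out of the prompt even when retrieval ranks them correctly, which is the trade-off between context size and retrieval quality that the abstract points to. In practice the passages would come from a curated source such as clinical guidelines; any passage text used to try the sketch is a placeholder.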
Keyphrases
- healthcare
- endothelial cells
- neural network
- decision making
- primary care
- palliative care
- autism spectrum disorder
- public health
- induced pluripotent stem cells
- machine learning
- quality improvement
- pluripotent stem cells
- health information
- neuropathic pain
- case report
- mass spectrometry
- virtual reality
- affordable care act
- social media
- pain management
- resistance training
- high speed