Leveraging Large Language Models (LLM) for the Plastic Surgery Resident Training: Do They Have a Role?

Devi Prasad Mohapatra, Friji Meethale Thiruvoth, Satyaswarup Tripathy, Sheeja Rajan T M, Madhubari Vathulya, Palukuri Lakshmi, Veena K Singh, Ansar Ul Haq

Published in: Indian Journal of Plastic Surgery: Official Publication of the Association of Plastic Surgeons of India (2023)

Introduction

Large language models (LLMs) are designed to recognize, summarize, translate, predict, and generate text-based content from knowledge gained from extensive data sets. ChatGPT4 (Generative Pre-trained Transformer 4) (OpenAI, San Francisco, California, United States) is a transformer-based LLM pretrained on public data as well as data obtained from third-party sources, and refined using the deep learning techniques of fine-tuning and reinforcement learning from human feedback to predict the next token of text. We wanted to explore the role of an LLM as a teaching assistant (TA) in plastic surgery.

Material and Methods

TA roles were first identified in the available literature. Based on these roles, a list of suitable tasks that an LLM could perform was created. Prompts designed to elicit appropriate output were then written and submitted to the LLM (specifically ChatGPT). The outputs generated were scored by evaluators and compared for interobserver agreement.

Results

A final set of eight TA roles was identified where an LLM could be utilized to generate content. The generated content was scored for usefulness and accuracy, independently by the eight study authors, on a scoring sheet created for the study. Interobserver agreement for content accuracy, usefulness, and clarity was 100% for content generated for the following: interactive case studies (generation), simulation of preoperative consultations, and generation of ethical considerations.

Discussion

LLMs in general, and ChatGPT (on which this study is based) in particular, can generate answers to questions and prompts based on the large amount of text used to train the underlying language model. The answers generated have been found to be accurate, readable, and even indistinguishable from human-generated text. This capability of automated content synthesis can be exploited to summarize text, answer short- and long-answer questions, and generate case scenarios. We identified a few such scenarios where an LLM could play the role of a TA in general and aid plastic surgery residents in particular. In addition, these models could be used by students to obtain feedback and to reflect on their learning, which itself stimulates critical thinking.

Conclusion

Incorporating LLMs into the educational arsenal of plastic surgery residency programs can provide a dynamic, interactive, and individualized learning experience for residents, and LLMs may prove to be worthy TAs of the future.
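As an illustration of the interobserver agreement mentioned in the Results, the sketch below computes a simple percent agreement across raters. It is a minimal example under stated assumptions: the abstract does not specify how agreement was calculated, so the modal-score definition used here, the function name `percent_agreement`, the rating labels, and the example scores are all hypothetical and are not taken from the study.

```python
from collections import Counter


def percent_agreement(scores):
    """Return the percentage of raters who gave the most common (modal) score.

    Assumption: agreement is defined as the share of raters matching the
    modal rating. The study's actual definition is not stated in the abstract.
    """
    if not scores:
        raise ValueError("at least one score is required")
    modal_count = Counter(scores).most_common(1)[0][1]
    return 100.0 * modal_count / len(scores)


# Hypothetical ratings from eight evaluators (illustrative only, not study data).
example_scores = {
    "hypothetical task A": ["agree"] * 8,
    "hypothetical task B": ["agree"] * 6 + ["disagree"] * 2,
}

for task, scores in example_scores.items():
    print(f"{task}: {percent_agreement(scores):.0f}% agreement")
```

Under this definition, a task on which all eight evaluators give the same rating yields 100% agreement. Chance-corrected statistics such as Fleiss' kappa are a common alternative when more than two raters are involved.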

Keyphrases