Artificial Intelligence in Postoperative Care: Assessing Large Language Models for Patient Recommendations in Plastic Surgery.
Cesar A. Gomez-Cabello, Sahar Borna, Sophia M. Pressman, Syed Ali Haider, Ajai Sehgal, Bradley C. Leibovich, Antonio Jorge Forte. Published in: Healthcare (Basel, Switzerland) (2024)
Since their release, the medical community has been actively exploring the capabilities of large language models (LLMs), which show promise in providing accurate medical knowledge. One potential application is as a patient resource. This study analyzes and compares the ability of three currently available LLMs, ChatGPT-3.5, GPT-4, and Gemini, to provide postoperative care recommendations to plastic surgery patients. We presented each model with 32 questions addressing common patient concerns after cosmetic surgical procedures and evaluated the medical accuracy, readability, understandability, and actionability of the models' responses. The three LLMs provided equally accurate information, with ChatGPT-3.5 averaging the highest Likert scale score (4.18 ± 0.93; p = 0.849), while Gemini provided significantly more readable (p = 0.001) and understandable responses (p = 0.014; p = 0.001). There was no difference in the actionability of the models' responses (p = 0.830). Although LLMs have shown potential as adjunctive tools in postoperative patient care, further refinement and research are needed before they can evolve into comprehensive standalone resources.
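To illustrate the kind of evaluation pipeline described above, the sketch below shows how answers from several language models might be collected and scored for readability. It is a minimal sketch under stated assumptions: `query_model` is a hypothetical placeholder for whichever LLM API is actually used, the question list is abbreviated, and the readability scores rely on the `textstat` package's Flesch metrics rather than the exact instruments and Likert/PEMAT ratings applied in the study.

```python
# Minimal sketch: collect model answers to postoperative-care questions and
# compute average readability per model. "query_model" is a hypothetical
# placeholder, not the study's actual method of querying GPT-3.5, GPT-4,
# or Gemini; the question list here is abbreviated and illustrative only.
from statistics import mean

import textstat  # pip install textstat


def query_model(model_name: str, question: str) -> str:
    """Stand-in for a real LLM API call; replace with an actual provider call."""
    return f"Placeholder answer from {model_name} about: {question}"


QUESTIONS = [
    "How should I care for my incision after a facelift?",
    "When can I resume exercise after liposuction?",
    # ... the study used 32 such patient questions in total.
]

MODELS = ["gpt-3.5", "gpt-4", "gemini"]


def readability_report(models=MODELS, questions=QUESTIONS) -> dict:
    """Return mean Flesch Reading Ease and Flesch-Kincaid grade per model."""
    report = {}
    for model in models:
        answers = [query_model(model, q) for q in questions]
        report[model] = {
            "flesch_reading_ease": mean(textstat.flesch_reading_ease(a) for a in answers),
            "flesch_kincaid_grade": mean(textstat.flesch_kincaid_grade(a) for a in answers),
        }
    return report


if __name__ == "__main__":
    for model, scores in readability_report().items():
        print(model, scores)
```

In a real replication, the placeholder query function would be swapped for the relevant provider SDKs, and the scored responses would additionally be rated by clinicians for accuracy, understandability, and actionability as the study describes.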
Keyphrases
- healthcare
- artificial intelligence
- machine learning
- health information
- clinical practice
- patient reported outcomes