Can Large Language Models (LLMs) Predict the Appropriate Treatment of Acute Hip Fractures in Older Adults? Comparing Appropriate Use Criteria With Recommendations From ChatGPT.

Katrina S Nietsch, Nancy Shrestha, Laura C Mazudie Ndjonko, Wasil Ahmed, Mateo Restrepo Mejia, Bashar Zaidat, Renee Ren, Akiro H Duey, Samuel Q Li, Jun S Kim, Krystin A Hidden, Samuel K Cho

Published in: Journal of the American Academy of Orthopaedic Surgeons. Global research & reviews (2024)

ChatGPT-4.0 scores were not concordant with American Academy of Orthopaedic Surgeons (AAOS) scores: ChatGPT-4.0 overestimated the appropriateness of total hip arthroplasty, hemiarthroplasty, and long cephalomedullary nails, and underestimated that of the other three treatments. It was inadequate at selecting a treatment deemed acceptable, most reasonable, and most likely to improve patient outcomes.

Keyphrases

total hip arthroplasty
physical activity
autism spectrum disorder
liver failure
hepatitis b virus
drug induced
replacement therapy
mechanical ventilation