Evaluating a Large Language Model's Ability to Answer Clinicians' Requests for Evidence Summaries.
Mallory N Blasingame, Taneya Y Koonce, Annette M Williams, Dario A Giuse, Jing Su, Poppy A Krump, Nunzia Bettinsoli Giuse
Published in: medRxiv: the preprint server for health sciences (2024)
Overall, the performance of a generative AI tool was promising. However, many of the references it included could not be independently verified, and no attempt was made to assess whether the additional concepts introduced by aiChat were factually accurate. We therefore envision this as the first in a series of investigations designed to further our understanding of how current and future versions of generative AI can be used and integrated into medical librarians' workflows.