Impact of high-quality, mixed-domain data on the performance of medical language models.
Maxime Griot, Coralie Hemptinne, Jean Vanderdonckt, Demet Yuksel
Published in: Journal of the American Medical Informatics Association : JAMIA (2024)
This study sets a new standard in medical language models, proving that a strategically trained, smaller model can outperform larger ones in clinical relevance and general proficiency, highlighting the importance of data quality and expert curation in generative artificial intelligence for healthcare applications.