From vision to text: A comprehensive review of natural image captioning in medical diagnosis and radiology report generation.

Gabriel Reale-Nosei, Elvira Amador-Domínguez, Emilio Serrano
Published in: Medical Image Analysis (2024)
Natural Image Captioning (NIC) is an interdisciplinary research area at the intersection of Computer Vision (CV) and Natural Language Processing (NLP). Work on the subject ranges from early template-based approaches to more recent deep learning-based methods. This paper surveys the area of NIC, focusing on its applications to Medical Image Captioning (MIC) and Diagnostic Captioning (DC) in radiology. The review of the state of the art summarizes key research works in NIC and DC to provide a broad overview of the subject. These works include existing NIC and MIC models, datasets, evaluation metrics, and previous reviews in the specialized literature. The reviewed work is analyzed and discussed in depth, highlighting the limitations of existing approaches and their potential implications for real clinical practice. Future research lines are then outlined on the basis of the limitations identified.
Keyphrases