Natural Image Captioning (NIC) is an interdisciplinary research area that lies at the intersection of Computer Vision (CV) and Natural Language Processing (NLP). Several works have been presented on the subject, ranging from early template-based approaches to more recent deep learning-based methods. This paper surveys the area of NIC, focusing in particular on its applications to Medical Image Captioning (MIC) and Diagnostic Captioning (DC) in the field of radiology. The state of the art is reviewed by summarizing key research works in NIC and DC to provide a broad overview of the subject. These works cover existing NIC and MIC models, datasets, evaluation metrics, and previous reviews in the specialized literature. The reviewed work is thoroughly analyzed and discussed, highlighting the limitations of existing approaches and their potential implications for real clinical practice. Finally, potential future research lines are outlined on the basis of the identified limitations.
Keyphrases
- deep learning
- artificial intelligence
- convolutional neural network
- clinical practice
- healthcare
- machine learning
- systematic review
- human health
- current status
- high resolution