A visual-language foundation model for computational pathology.
Ming Y. Lu, Bowen Chen, Drew F. K. Williamson, Richard J. Chen, Ivy Liang, Tong Ding, Guillaume Jaume, Igor Odintsov, Long Phi Le, Georg K. Gerber, Anil V. Parwani, Andrew Zhang, Faisal Mahmood
Published in: Nature Medicine (2024)
The accelerated adoption of digital pathology and advances in deep learning have enabled the development of robust models for various pathology tasks across a diverse array of diseases and patient cohorts. However, model training is often difficult due to label scarcity in the medical domain, and a model's usage is limited by the specific task and disease for which it is trained. Additionally, most models in histopathology leverage only image data, a stark contrast to how humans teach each other and reason about histopathologic entities. We introduce CONtrastive learning from Captions for Histopathology (CONCH), a visual-language foundation model developed using diverse sources of histopathology images, biomedical text and, notably, over 1.17 million image-caption pairs through task-agnostic pretraining. Evaluated on a suite of 14 diverse benchmarks, CONCH can be transferred to a wide range of downstream tasks involving histopathology images and/or text, achieving state-of-the-art performance on histology image classification, segmentation, captioning, and text-to-image and image-to-text retrieval. CONCH represents a substantial leap over concurrent visual-language pretrained systems for histopathology, with the potential to directly facilitate a wide array of machine learning-based workflows requiring minimal or no further supervised fine-tuning.
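To make the pretraining and zero-shot transfer described above concrete, the following is a minimal sketch of a CLIP-style symmetric image-text contrastive objective and the prompt-based zero-shot classification it enables. It is an illustration only, not the authors' implementation: CONCH's actual recipe, encoders, and hyperparameters differ, and every name, shape, and the temperature value here are assumptions.

```python
import torch
import torch.nn.functional as F

def symmetric_contrastive_loss(image_emb: torch.Tensor,
                               text_emb: torch.Tensor,
                               temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings.

    The i-th image and i-th caption form a positive pair; every other
    in-batch combination serves as a negative.
    """
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature   # (batch, batch) similarity matrix
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_i2t = F.cross_entropy(logits, targets)       # image -> caption direction
    loss_t2i = F.cross_entropy(logits.t(), targets)   # caption -> image direction
    return 0.5 * (loss_i2t + loss_t2i)

@torch.no_grad()
def zero_shot_classify(image_emb: torch.Tensor,
                       class_prompt_emb: torch.Tensor) -> torch.Tensor:
    """Assign each image to the class whose text-prompt embedding is most similar."""
    image_emb = F.normalize(image_emb, dim=-1)
    class_prompt_emb = F.normalize(class_prompt_emb, dim=-1)
    return (image_emb @ class_prompt_emb.t()).argmax(dim=-1)

# Toy usage: random tensors stand in for encoder outputs (hypothetical shapes).
images = torch.randn(8, 512)    # batch of image embeddings
captions = torch.randn(8, 512)  # matching caption embeddings
print(symmetric_contrastive_loss(images, captions))

prompts = torch.randn(3, 512)   # one text-prompt embedding per class
print(zero_shot_classify(images, prompts))
```

Once the two encoders are aligned this way, a new classification task reduces to embedding one text prompt per class and ranking images against them, which is what allows transfer "requiring minimal or no further supervised fine-tuning".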