Quantitative image analysis pipeline for detecting circulating hybrid cells in immunofluorescence images with human-level accuracy.
Robert T Heussner, Riley M Whalen, Ashley Anderson, Heather Theison, Joseph Baik, Summer Gibbs, Melissa H Wong, Young Hwan Chang
Published in: Cytometry. Part A : the journal of the International Society for Analytical Cytology (2024)
Circulating hybrid cells (CHCs) are a newly discovered, tumor-derived cell population found in the peripheral blood of cancer patients and are thought to contribute to tumor metastasis. However, identifying CHCs by immunofluorescence (IF) imaging of patient peripheral blood mononuclear cells (PBMCs) is a time-consuming and subjective process that currently relies on manual annotation by laboratory technicians. Additionally, while IF is relatively easy to apply to tissue sections, its application to PBMC smears is complicated by biological and technical artifacts. To address these challenges, we present a robust image analysis pipeline to automate the detection and analysis of CHCs in IF images. The pipeline incorporates quality control to optimize specimen preparation protocols and remove unwanted artifacts, leverages a β-variational autoencoder (β-VAE) to learn meaningful latent representations of single-cell images, and employs a support vector machine (SVM) classifier to achieve human-level CHC detection. To evaluate the pipeline, we created a rigorously labeled IF CHC data set covering nine patients and two disease sites with the assistance of 10 annotators. We examined annotator variation and bias in CHC detection and provided guidelines to optimize the accuracy of CHC annotation. We found that all annotators agreed on CHC identification for only 65% of the cells in the data set and tended to underestimate CHC counts for regions of interest (ROIs) containing relatively large numbers of cells (>50,000) when using the conventional enumeration method. In contrast, our proposed approach is not biased by ROI size. The SVM classifier trained on the β-VAE embeddings achieved an F1 score of 0.80, matching the average performance of human annotators. Our pipeline enables researchers to explore the role of CHCs in cancer progression and assess their potential as a clinical biomarker for metastasis. Further, we demonstrate that the pipeline can identify discrete cellular phenotypes among PBMCs, highlighting its utility beyond CHCs.
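To make the reported comparison concrete, the following is a minimal sketch of how an F1 score for the CHC class could be computed for a classifier and for individual annotators, and how the classifier could then be compared with the annotator average. It is illustrative only: the abstract does not state how reference labels were defined, so the `consensus` reference, the label vectors, the annotator names, and the helper `f1_score` are all hypothetical and do not reproduce the authors' data or code.

```python
def f1_score(reference, predicted):
    """F1 for the positive class, with labels 1 = CHC and 0 = non-CHC.

    Uses F1 = 2*TP / (2*TP + FP + FN), which is equivalent to the
    harmonic mean of precision and recall.
    """
    pairs = list(zip(reference, predicted))
    tp = sum(1 for r, p in pairs if r == 1 and p == 1)  # true positives
    fp = sum(1 for r, p in pairs if r == 0 and p == 1)  # false positives
    fn = sum(1 for r, p in pairs if r == 1 and p == 0)  # false negatives
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Made-up labels for ten cells (assumption: a consensus label serves as reference).
consensus = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
classifier = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
annotators = {
    "annotator_a": [1, 1, 1, 1, 1, 1, 0, 0, 0, 0],  # over-calls CHCs
    "annotator_b": [1, 1, 0, 0, 0, 0, 0, 0, 0, 0],  # under-calls CHCs
}

classifier_f1 = f1_score(consensus, classifier)
annotator_f1 = {name: f1_score(consensus, labels) for name, labels in annotators.items()}
mean_annotator_f1 = sum(annotator_f1.values()) / len(annotator_f1)

print(f"classifier F1: {classifier_f1:.2f}")
print(f"mean annotator F1: {mean_annotator_f1:.2f}")
```

Because F1 ignores true negatives, it suits rare-event detection such as CHC identification, where non-CHC cells vastly outnumber CHCs and plain accuracy would be dominated by the majority class.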
Keyphrases
- induced apoptosis
- endothelial cells
- cell cycle arrest
- single cell
- deep learning
- peripheral blood
- end stage renal disease
- quality control
- oxidative stress
- rna seq
- high resolution
- stem cells
- endoplasmic reticulum stress
- loop mediated isothermal amplification
- induced pluripotent stem cells
- convolutional neural network
- mesenchymal stem cells
- bone marrow
- optical coherence tomography
- electronic health record
- peritoneal dialysis
- magnetic resonance imaging
- big data
- body composition
- newly diagnosed
- mass spectrometry
- high throughput
- real time pcr
- signaling pathway
- patient reported outcomes
- ejection fraction
- artificial intelligence
- computed tomography
- liquid chromatography