Instance segmentation for whole slide imaging: end-to-end or detect-then-segment.

Aadarsh Jha, Hai-Chun Yang, Ruining Deng, Meghan Elizabeth Kapp, Agnes B Fogo, Yuankai Huo

Published in: Journal of medical imaging (Bellingham, Wash.) (2021)

Purpose: Automatic instance segmentation of glomeruli within kidney whole slide imaging (WSI) is essential for clinical research in renal pathology. In computer vision, end-to-end instance segmentation methods (e.g., Mask-RCNN) have shown advantages over detect-then-segment approaches by performing the complementary detection and segmentation tasks simultaneously. As a result, the end-to-end Mask-RCNN approach has been the de facto standard in recent glomerular segmentation studies, where downsampling and patch-based techniques are used to handle the high-resolution images from WSI (e.g., >10,000 × 10,000 pixels at 40× magnification). However, in high-resolution WSI, a single glomerulus can itself exceed 1000 × 1000 pixels at the original resolution, so significant information is lost when the corresponding feature maps are downsampled to 28 × 28 resolution in the end-to-end Mask-RCNN pipeline.
Approach: We assess whether the end-to-end instance segmentation framework is optimal for high-resolution WSI objects by comparing Mask-RCNN with our proposed detect-then-segment framework. Beyond this comparison, we also comprehensively evaluate the performance of our detect-then-segment pipeline across: (1) two of the most prevalent segmentation backbones (U-Net and DeepLab_v3); (2) six different image resolutions (512 × 512, 256 × 256, 128 × 128, 64 × 64, 32 × 32, and 28 × 28); and (3) two different color spaces (RGB and LAB).
Results: Our detect-then-segment pipeline, with the DeepLab_v3 segmentation framework operating on previously detected glomeruli at 512 × 512 resolution, achieved a Dice similarity coefficient (DSC) of 0.953, compared with a DSC of 0.902 from the end-to-end Mask-RCNN pipeline. Further, we found that neither the RGB nor the LAB color space outperformed the other within the detect-then-segment framework.
Conclusions: The detect-then-segment pipeline achieved better segmentation performance than the end-to-end method. Our study provides an extensive quantitative reference to help other researchers select the most accurate segmentation approach for glomeruli, or other biological objects of similar character, on high-resolution WSI.
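
The two quantities the abstract relies on, the DSC and the loss incurred when a large object is represented at a small mask resolution, can be illustrated with a minimal NumPy sketch. Everything here is an assumption made for illustration and is not the authors' pipeline: the helper names (`dice`, `resize_nearest`, `disk_mask`) are hypothetical, the "glomerulus" is a synthetic disk, and resizing uses plain nearest-neighbor sampling rather than anything Mask-RCNN actually does. The sketch only shows how the DSC is computed and that a round trip through a coarse grid cannot recover the original boundary exactly.

```python
import numpy as np


def dice(a, b):
    """Dice similarity coefficient between two binary masks."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        # Both masks are empty: treat them as a perfect match.
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / total


def resize_nearest(mask, size):
    """Resize a 2D mask to size x size by nearest-neighbor index sampling."""
    h, w = mask.shape
    rows = (np.arange(size) * h) // size
    cols = (np.arange(size) * w) // size
    return mask[np.ix_(rows, cols)]


def disk_mask(size, radius):
    """Synthetic stand-in for a glomerulus: a centered filled disk."""
    yy, xx = np.mgrid[:size, :size]
    c = (size - 1) / 2
    return (yy - c) ** 2 + (xx - c) ** 2 <= radius ** 2


# A 1000 x 1000 object, roughly the scale quoted in the abstract.
truth = disk_mask(1000, 400)

# Round trip through each resolution studied in the paper.
scores = {}
for res in (512, 256, 128, 64, 32, 28):
    coarse = resize_nearest(truth, res)
    restored = resize_nearest(coarse, truth.shape[0])
    scores[res] = dice(truth, restored)
    print(f"{res:>3} x {res:<3} round-trip DSC: {scores[res]:.4f}")
```

The printed values are an upper bound imposed by resolution alone on this synthetic shape; they say nothing about the learned models compared in the paper, whose reported DSCs (0.953 and 0.902) come from real annotated glomeruli.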

Keyphrases