In minimally invasive surgery videos, label-free monocular laparoscopic depth estimation is challenging because surgical smoke degrades the image. We therefore propose a self-supervised, collaborative network-based depth estimation method with smoke removal for monocular endoscopic video. The method is decomposed into two steps: smoke removal and depth estimation. In the first step, we develop a cyclic GAN for endoscopic smoke removal (DS-cGAN) to mitigate smoke components at different concentrations. The generator network comprises a sharpened guide encoding module (SGEM), a residual dense bottleneck module (RDBM), and a refined upsampling convolution module (RUCM), which together restore more detailed organ edges and tissue structures. In the second step, a high-resolution residual U-Net (HRR-UNet), consisting of one DepthNet and two PoseNets, is designed to improve depth estimation accuracy; adjacent frames are used to estimate camera ego-motion. Notably, the proposed method requires neither manual labeling nor patient computed tomography scans in either the training or the inference phase. Experiments on the Hamlyn Centre laparoscopic dataset show that our method recovers accurate depth information after smoke removal in real surgical scenes while preserving the blood vessels, contours, and textures of the surgical site. The results demonstrate that the proposed method outperforms existing state-of-the-art methods and runs in real time at 94.45 fps, which makes it promising for clinical application.
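The two-step structure described above can be illustrated with a minimal Python sketch. It shows only how a smoke-removal stage and a depth stage compose frame by frame, and what per-frame time budget the reported 94.45 fps implies. The names `Frame`, `estimate_depth_sequence`, `per_frame_budget_ms`, and the two stand-in callables are hypothetical and introduced here for illustration; they are not the DS-cGAN or HRR-UNet implementations, whose details the abstract does not give.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for an image: a 2D grid of intensities.
Frame = List[List[float]]


def per_frame_budget_ms(fps: float) -> float:
    """Return the time available per frame, in milliseconds, at a given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def estimate_depth_sequence(
    frames: Sequence[Frame],
    desmoke: Callable[[Frame], Frame],
    depth_net: Callable[[Frame], Frame],
) -> List[Frame]:
    """Apply step 1 (smoke removal) and then step 2 (depth estimation) to each frame."""
    return [depth_net(desmoke(frame)) for frame in frames]


# Placeholder stages (assumptions): identity "desmoking" and a constant depth map.
def identity_desmoke(frame: Frame) -> Frame:
    return [row[:] for row in frame]


def constant_depth(frame: Frame) -> Frame:
    return [[1.0 for _ in row] for row in frame]


video = [[[0.2, 0.4], [0.6, 0.8]] for _ in range(3)]
depth_maps = estimate_depth_sequence(video, identity_desmoke, constant_depth)
budget = per_frame_budget_ms(94.45)  # roughly 10.6 ms for both stages combined
```

At the reported frame rate, the smoke-removal and depth stages together would have about 10.6 ms per frame, assuming the 94.45 fps figure covers the full pipeline.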
Keyphrases
- optical coherence tomography
- high resolution
- computed tomography
- ultrasound guided
- machine learning
- label free
- randomized controlled trial
- robot assisted
- systematic review
- magnetic resonance imaging
- high speed
- healthcare
- mass spectrometry
- magnetic resonance
- contrast enhanced
- electronic health record
- minimally invasive
- positron emission tomography
- quality improvement
- endoscopic submucosal dissection
- network analysis