Comparison of automated full-body bone metastases delineation methods and their corresponding prognostic power.
Brayden Schott, Amy J Weisman, Timothy G Perk, Alison R Roth, Glenn Liu, Robert Jeraj
Published in: Physics in Medicine and Biology (2023)
Objective. Manual disease delineation in full-body imaging of patients with multiple metastases is often impractical due to high disease burden. However, this is a clinically relevant task, as quantitative imaging techniques assessing individual metastases, while limited, have been shown to be predictive of treatment outcome. The goal of this work was to evaluate the efficacy of deep learning-based methods for full-body delineation of skeletal metastases and to compare their performance to existing methods in terms of disease delineation accuracy and prognostic power. Approach. 1833 suspicious lesions on 37 18F-NaF PET/CT scans of patients with metastatic castration-resistant prostate cancer (mCRPC) were contoured and classified as malignant, equivocal, or benign by a nuclear medicine physician. Two convolutional neural network (CNN) architectures (DeepMedic and nnUNet) were trained to delineate malignant disease regions with and without three-model ensembling. Malignant disease contours were also obtained using previously established methods. The performance of each method was assessed on four tasks: (1) detection, (2) segmentation, (3) PET SUV metric correlations with physician-based data, and (4) prognostic power for progression-free survival. Main Results. The nnUNet three-model ensemble achieved superior detection performance with a mean (± standard deviation) sensitivity of 82.9±ccc 0.1% at the selected operating point. The nnUNet single model and three-model ensemble achieved comparable segmentation performance with mean Dice coefficients of 0.80±0.12 and 0.79±0.12, respectively, both outperforming other methods. The nnUNet ensemble achieved comparable or superior SUV metric correlation performance to gold-standard data. Despite superior disease delineation performance, the nnUNet methods did not display superior prognostic power over other methods. Significance. This work showed that CNN-based (nnUNet) methods are superior to the non-CNN methods for mCRPC disease delineation in full-body 18F-NaF PET/CT. The CNN-based methods, however, do not hold greater prognostic power for predicting clinical outcome. This merits further investigation into the optimal selection of delineation methods for specific clinical tasks.
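The abstract reports three quantities that are easy to state concretely: the Dice coefficient for segmentation, lesion-level sensitivity for detection, and multi-model ensembling. The following is a minimal Python sketch of each, not the authors' implementation. The function names, the any-overlap detection criterion, and the average-then-threshold ensembling rule are all assumptions made for illustration; the abstract does not specify how the study defined them.

```python
import numpy as np


def dice_coefficient(pred, truth):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    denom = pred.sum() + truth.sum()
    if denom == 0:
        # Assumption: two empty masks are treated as perfect agreement.
        return 1.0
    return 2.0 * np.logical_and(pred, truth).sum() / denom


def lesion_sensitivity(pred, truth_labels):
    """Fraction of reference lesions touched by at least one predicted voxel.

    truth_labels is an integer array: 0 = background, k > 0 = lesion id.
    The any-overlap criterion is an assumption, not the paper's definition.
    """
    pred = np.asarray(pred, dtype=bool)
    truth_labels = np.asarray(truth_labels)
    lesion_ids = np.unique(truth_labels[truth_labels > 0])
    if lesion_ids.size == 0:
        return float("nan")  # sensitivity is undefined without reference lesions
    detected = sum(bool(pred[truth_labels == k].any()) for k in lesion_ids)
    return detected / lesion_ids.size


def ensemble_mask(prob_maps, threshold=0.5):
    """Average per-voxel probabilities from several models, then threshold.

    Averaging is one common ensembling rule; it is assumed here.
    """
    mean_prob = np.mean(np.stack(prob_maps, axis=0), axis=0)
    return mean_prob >= threshold


if __name__ == "__main__":
    # Toy 2D example: two reference lesions, one of which is found.
    truth_labels = np.array([[1, 0, 0, 0],
                             [0, 0, 0, 2]])
    pred = np.array([[1, 1, 0, 0],
                     [0, 0, 0, 0]])
    print("Dice:", dice_coefficient(pred, truth_labels > 0))
    print("Sensitivity:", lesion_sensitivity(pred, truth_labels))
```

In this toy case one of two reference lesions is overlapped by the prediction, and one of the two predicted voxels is correct, so both metrics evaluate to 0.5. Real PET/CT evaluation would operate on 3D volumes and would typically also count false-positive detections to fix an operating point.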
Keyphrases
- convolutional neural network
- PET/CT
- deep learning
- positron emission tomography
- machine learning
- computed tomography
- high resolution
- magnetic resonance imaging
- artificial intelligence
- body composition
- high intensity
- sensitive detection
- diffusion weighted imaging