Inter- and Intra-Observer Variability and the Effect of Experience in Cine-MRI for Adhesion Detection.
Bram de Wilde, Frank Joosten, Wulphert Venderink, Mirjam E J Davidse, Juliëtte Geurts, Hanneke Kruijt, Afke Vermeulen, Bibi Martens, Maxime V P Schyns, Josephine C B M Huige, Myrte C de Boer, Bart A R Tonino, Herman J A Zandvoort, Kirsti Lammert, Helka Parviainen, Aino-Maija Vuorinen, Suvi Syväranta, Ruben R M Vogels, Wiesje Prins, Andrea Coppola, Nancy Bossa, Richard Peter Gerardus Ten Broek, Henkjan Huisman
Published in: Journal of Imaging (2023)
Cine-MRI for adhesion detection is a promising novel modality that can help the large group of patients developing pain after abdominal surgery. Few studies into its diagnostic accuracy are available, and none address observer variability. This retrospective study explores the inter- and intra-observer variability, diagnostic accuracy, and the effect of experience. A total of 15 observers with varying levels of experience reviewed 61 sagittal cine-MRI slices, placing box annotations with a confidence score at locations suspected of adhesions. Five observers reviewed the slices again one year later. Inter- and intra-observer variability are quantified using Fleiss' (inter) and Cohen's (intra) κ and percentage agreement. Diagnostic accuracy is quantified with receiver operating characteristic (ROC) analysis based on a consensus standard. Inter-observer Fleiss' κ values range from 0.04 to 0.34, showing poor to fair agreement. High general and cine-MRI experience led to significantly (p < 0.001) better agreement among observers. The intra-observer results show Cohen's κ values between 0.37 and 0.53 for all observers, except one with a low κ of -0.11. Group scores for the area under the ROC curve (AUC) lie between 0.66 and 0.72, with individual observers reaching 0.78. This study confirms that cine-MRI can diagnose adhesions with respect to a radiologist consensus panel, and shows that experience improves the reading of cine-MRI. Observers without specific experience adapt to this modality quickly after a short online tutorial. Observer agreement is fair at best, and AUC scores leave room for improvement. Consistently interpreting this novel modality needs further research, for instance, by developing reporting guidelines or artificial intelligence-based methods.
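The two agreement statistics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' analysis code. It assumes that each observer's box annotations have already been reduced to one categorical label per slice (for example, adhesion present or absent), which the abstract does not specify. The function names `cohen_kappa` and `fleiss_kappa` and the toy inputs are hypothetical.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa for two readings of the same items (intra-observer case).

    `first` and `second` are equal-length sequences of categorical labels.
    Undefined (division by zero) when chance agreement equals 1.
    """
    n = len(first)
    # Observed agreement: fraction of items with identical labels.
    p_obs = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement from the marginal label frequencies of each reading.
    count_a, count_b = Counter(first), Counter(second)
    labels = set(first) | set(second)
    p_exp = sum(count_a[k] * count_b[k] for k in labels) / (n * n)
    return (p_obs - p_exp) / (1 - p_exp)


def fleiss_kappa(counts):
    """Fleiss' kappa for several observers (inter-observer case).

    counts[i][j] is the number of observers who assigned item i to
    category j; every row must sum to the same number of observers.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])
    # Per-item agreement: proportion of agreeing observer pairs.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(per_item) / n_items
    # Chance agreement from the overall category proportions.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_cats)
    ]
    p_exp = sum(p * p for p in p_cat)
    return (p_bar - p_exp) / (1 - p_exp)


# Hypothetical toy data, not taken from the study.
intra = cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0])
inter = fleiss_kappa([[3, 0], [0, 3], [2, 1], [1, 2]])
```

Both statistics compare observed agreement with the agreement expected by chance, which is why κ can fall below zero, as for the one observer with κ of -0.11.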
Keyphrases
- contrast enhanced
- magnetic resonance imaging
- artificial intelligence
- diffusion weighted imaging
- magnetic resonance
- computed tomography
- newly diagnosed
- ejection fraction
- emergency department
- transcription factor
- chronic pain
- escherichia coli
- working memory
- social media
- deep learning
- label free
- loop mediated isothermal amplification