
Deep Learning Model for Automated Detection and Classification of Central Canal, Lateral Recess, and Neural Foraminal Stenosis at Lumbar Spine MRI.

James Thomas Patrick Decourcy Hallinan, Lei Zhu, Kaiyuan Yang, Andrew Makmur, Diyaa Abdul Rauf Algazwi, Yee Liang Thian, Samuel Lau, Yun Song Choo, Sterling Ellis Eide, Qai Ven Yap, Yiong-Huak Chan, Jiong Hao Tan, Aravind Kumar, Beng-Chin Ooi, Hiroshi Yoshioka, Swee Tian Quek
Published in: Radiology (2021)
Background Assessment of lumbar spinal stenosis at MRI is repetitive and time consuming. Deep learning (DL) could improve productivity and the consistency of reporting.
Purpose To develop a DL model for automated detection and classification of lumbar central canal, lateral recess, and neural foraminal stenosis.
Materials and Methods In this retrospective study, lumbar spine MRI scans obtained from September 2015 to September 2018 were included. Studies of patients with spinal instrumentation or studies with suboptimal image quality, as well as postgadolinium studies and studies of patients with scoliosis, were excluded. Axial T2-weighted and sagittal T1-weighted images were used. Studies were split into an internal training set (80%), validation set (9%), and test set (11%). Training data were labeled by four radiologists using predefined gradings (normal, mild, moderate, and severe). A two-component DL model was developed: a first convolutional neural network (CNN) was trained to detect the region of interest (ROI), and a second CNN was trained for classification. An internal test set was labeled by a musculoskeletal radiologist with 31 years of experience (reference standard) and two subspecialist radiologists (radiologist 1: A.M., 5 years of experience; radiologist 2: J.T.P.D.H., 9 years of experience). DL model performance on an external test set was also evaluated. Detection recall (in percentage), interrater agreement (Gwet κ), sensitivity, and specificity were calculated.
Results Overall, 446 MRI lumbar spine studies were analyzed (446 patients; mean age ± standard deviation, 52 years ± 19; 240 women), with 396 patients in the training (80%) and validation (9%) sets and 50 (11%) in the internal test set. For internal testing, DL model and radiologist central canal recall were greater than 99%, with reduced neural foramina recall for the DL model (84.5%) and radiologist 1 (83.9%) compared with radiologist 2 (97.1%) (P < .001). For internal testing, dichotomous classification (normal or mild vs moderate or severe) showed almost perfect agreement for both radiologists and the DL model, with respective κ values of 0.98, 0.98, and 0.96 for the central canal; 0.92, 0.95, and 0.92 for lateral recesses; and 0.94, 0.95, and 0.89 for neural foramina (P < .001). External testing with 100 MRI scans of lumbar spines showed almost perfect agreement for the DL model for dichotomous classification of all ROIs (κ, 0.95-0.96; P < .001).
Conclusion A deep learning model showed comparable agreement with subspecialist radiologists for detection and classification of central canal and lateral recess stenosis, with slightly lower agreement for neural foraminal stenosis at lumbar spine MRI.
© RSNA, 2021. Online supplemental material is available for this article. See also the editorial by Hayashi in this issue.
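The agreement and accuracy measures named in the abstract can be illustrated with a short sketch. It is not the authors' code: it assumes the unweighted two-rater form of Gwet's coefficient (AC1), and the function names, the `SEVERE_GRADES` constant, and the 0/1 encoding of the dichotomized grades are hypothetical choices made for illustration only.

```python
from collections import Counter

# Assumption: "moderate" and "severe" form the positive class of the
# dichotomous split (normal or mild vs moderate or severe).
SEVERE_GRADES = {"moderate", "severe"}


def dichotomize(grades):
    """Map four-level gradings to 0 (normal/mild) or 1 (moderate/severe)."""
    return [1 if g in SEVERE_GRADES else 0 for g in grades]


def gwet_ac1(rater_a, rater_b, categories=(0, 1)):
    """Unweighted Gwet AC1 for two raters over a fixed set of categories."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    q = len(categories)
    # Observed proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: sum of pi_k * (1 - pi_k) / (q - 1), where pi_k is
    # the mean proportion of items the two raters assign to category k.
    chance = 0.0
    for k in categories:
        pi_k = (counts_a[k] + counts_b[k]) / (2 * n)
        chance += pi_k * (1 - pi_k)
    chance /= q - 1
    return (observed - chance) / (1 - chance)


def sensitivity_specificity(reference, predicted):
    """Sensitivity and specificity of binary predictions vs a reference."""
    pairs = list(zip(reference, predicted))
    tp = sum(r == 1 and p == 1 for r, p in pairs)
    fn = sum(r == 1 and p == 0 for r, p in pairs)
    tn = sum(r == 0 and p == 0 for r, p in pairs)
    fp = sum(r == 0 and p == 1 for r, p in pairs)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

To use it, pass the reference standard gradings and the model (or radiologist) gradings through `dichotomize`, then give both lists to `gwet_ac1` and `sensitivity_specificity`. The published κ values may rest on a different variant or weighting of the coefficient, so this sketch should not be expected to reproduce them exactly.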
Keyphrases