Towards improved identification of vertebral fractures in routine CT scans: development and external validation of a machine learning algorithm.

Joeri Nicolaes, Michael Kriegbaum Skjødt, Steven Raeymaeckers, Christopher Dyer Smith, Bo Abrahamsen, Thomas Fuerst, Marc Debois, Dirk Vandermeulen, Cesar Libanati

Published in: Journal of bone and mineral research : the official journal of the American Society for Bone and Mineral Research (2023)

Vertebral fractures (VFs) are the hallmark of osteoporosis, being one of the most frequent types of fragility fracture and an early sign of the disease. They are associated with significant morbidity and mortality. VFs are found incidentally in one out of five imaging studies; however, more than half of VFs are neither identified nor reported in patient CT scans. Our study aimed to develop a machine learning algorithm to identify VFs in abdominal/chest CT scans and to evaluate its performance. We acquired two independent data sets of routine abdominal/chest CT scans of patients aged 50 years or older: a training set of 1,011 scans from a non-interventional, prospective proof-of-concept study at the Universitair Ziekenhuis (UZ) Brussel and a validation set of 2,000 subjects from an observational cohort study at the Hospital of Holbaek. Both data sets were externally reevaluated to identify reference standard VF readings using the Genant semiquantitative (SQ) grading. Four independent models were trained in a cross-validation experiment using the training set, and an ensemble of four models was applied to the external validation set. In the validation set, 15.3% of scans had ≥1 VF (SQ2-3), and 663 out of 24,930 evaluable vertebrae (2.7%) were fractured (SQ2-3) as per reference standard readings. Comparison of the ensemble model with the reference standard readings in identifying subjects with one or more moderate or severe VFs resulted in an AUROC of 0.88 (95% CI 0.85-0.90), accuracy of 0.92 (95% CI 0.91-0.93), kappa of 0.72 (95% CI 0.67-0.76), sensitivity of 0.81 (95% CI 0.76-0.85), and specificity of 0.95 (95% CI 0.93-0.96). We demonstrated that a machine learning algorithm trained for VF detection achieved strong performance on an external validation set. It has the potential to support healthcare professionals in the early identification of vertebral fractures and the prevention of future fragility fractures.
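For readers less familiar with the subject-level metrics reported above, the following minimal Python sketch shows how sensitivity, specificity, accuracy, and Cohen's kappa can be derived from a 2x2 confusion matrix of algorithm output versus reference standard reading. The class and function names are hypothetical, and the counts are illustrative only; they are not taken from the study and this is not the authors' evaluation code. AUROC and the confidence intervals are omitted because they require per-subject scores and a resampling or analytic interval method that the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Subject-level 2x2 counts (hypothetical container, not from the paper)."""
    tp: int  # algorithm positive, reference positive
    fp: int  # algorithm positive, reference negative
    fn: int  # algorithm negative, reference positive
    tn: int  # algorithm negative, reference negative


def subject_level_metrics(c: ConfusionCounts) -> dict:
    """Compute sensitivity, specificity, accuracy and Cohen's kappa."""
    n = c.tp + c.fp + c.fn + c.tn
    sensitivity = c.tp / (c.tp + c.fn)
    specificity = c.tn / (c.tn + c.fp)
    accuracy = (c.tp + c.tn) / n  # observed agreement

    # Agreement expected by chance, from the marginal positive/negative rates
    # of the reference standard and of the algorithm.
    p_both_positive = ((c.tp + c.fn) / n) * ((c.tp + c.fp) / n)
    p_both_negative = ((c.tn + c.fp) / n) * ((c.tn + c.fn) / n)
    expected = p_both_positive + p_both_negative

    kappa = (accuracy - expected) / (1 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "kappa": kappa,
    }


# Illustrative counts only (NOT the study's data).
example = ConfusionCounts(tp=80, fp=45, fn=20, tn=855)
metrics = subject_level_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Kappa is reported alongside accuracy because, with a low fracture prevalence, a high accuracy can be reached largely by agreeing on the many non-fractured subjects; kappa corrects the observed agreement for that chance component.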

Keyphrases