Subtle variation in sepsis-III definitions markedly influences predictive performance within and across methods.
Samuel N CohenJames FosterPeter FosterHang LouTerry LyonsSam MorleyJames MorrillHao NiEdward PalmerBo WangYue WuLingyi YangWeixin YangPublished in: Scientific reports (2024)
Early detection of sepsis is key to ensure timely clinical intervention. Since very few end-to-end pipelines are publicly available, fair comparisons between methodologies are difficult if not impossible. Progress is further limited by discrepancies in the reconstruction of sepsis onset time. This retrospective cohort study highlights the variation in performance of predictive models under three subtly different interpretations of sepsis onset from the sepsis-III definition and compares this against inter-model differences. The models are chosen to cover tree-based, deep learning, and survival analysis methods. Using the MIMIC-III database, between 867 and 2178 intensive care unit admissions with sepsis were identified, depending on the onset definition. We show that model performance can be more sensitive to differences in the definition of sepsis onset than to the model itself. Given a fixed sepsis definition, the best performing method had a gain of 1-5% in the area under the receiver operating characteristic (AUROC). However, the choice of onset time can cause a greater effect, with variation of 0-6% in AUROC. We illustrate that misleading conclusions can be drawn if models are compared without consideration of the sepsis definition used which emphasizes the need for a standardized definition for sepsis onset.