Performance of binary prediction models in high-correlation low-dimensional settings: a comparison of methods.
Artuur M Leeuwenberg, Maarten van Smeden, Johannes A Langendijk, Arjen van der Schaaf, Murielle E Mauer, Karel G M Moons, Johannes B Reitsma, Ewoud Schuit
Published in: Diagnostic and prognostic research (2022)
Based on the results, we recommend refraining from data-driven predictor selection approaches in the presence of high collinearity, because predictor selection becomes more unstable, even in settings with relatively high events per variable. Selecting certain predictors over others may also give the disproportionate impression that the included predictors are more strongly associated with the outcome than the excluded ones.
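The instability described above can be illustrated with a toy simulation. The sketch below is not the study's simulation design or any of the methods it compared. It assumes two standard-normal predictors with correlation `rho`, a binary outcome that truly depends only on the first predictor (logistic slope `beta1`), and a deliberately simple stand-in for data-driven selection: keep whichever predictor has the larger absolute univariable correlation with the outcome. All function names, parameter values, and the selection rule are hypothetical choices made for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # degenerate sample, e.g. all outcomes identical
    return sxy / math.sqrt(sxx * syy)


def simulate_dataset(rng, n, rho, beta1=1.0):
    """Draw n observations: x1 and x2 correlated at rho, y depends on x1 only."""
    x1s, x2s, ys = [], [], []
    for _ in range(n):
        z1 = rng.gauss(0, 1)
        z2 = rng.gauss(0, 1)
        x1 = z1
        x2 = rho * z1 + math.sqrt(1 - rho ** 2) * z2  # corr(x1, x2) = rho
        p = 1 / (1 + math.exp(-beta1 * x1))  # assumed logistic outcome model
        x1s.append(x1)
        x2s.append(x2)
        ys.append(1 if rng.random() < p else 0)
    return x1s, x2s, ys


def selection_frequency(rho, n=200, reps=300, seed=1):
    """Fraction of simulated datasets in which the toy rule selects x1."""
    rng = random.Random(seed)
    picked_x1 = 0
    for _ in range(reps):
        x1s, x2s, ys = simulate_dataset(rng, n, rho)
        # Toy selection rule: keep the predictor with the larger |correlation|.
        if abs(pearson(x1s, ys)) >= abs(pearson(x2s, ys)):
            picked_x1 += 1
    return picked_x1 / reps


if __name__ == "__main__":
    for rho in (0.0, 0.95):
        freq = selection_frequency(rho)
        print(f"rho={rho}: x1 selected in {freq:.0%} of datasets")
```

Under these assumptions, the true predictor is selected less consistently when `rho` is high, because the correlated predictor carries almost the same univariable signal and sampling noise decides which one is kept. A selected predictor in that setting should therefore not be read as more strongly associated with the outcome than the one left out.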