Stabl: sparse and reliable biomarker discovery in predictive modeling of high-dimensional omic data.
Julien Hedou, Ivana Marié, Grégoire Bellan, Jakob Einhaus, Dyani K Gaudillière, Francois-Xavier Ladant, Franck Verdonk, Ina Annelies Stelzer, Dorien Feyaerts, Amy S Tsai, Edward Ganio, Maximilian Sabayev, Joshua Gillard, Adam Bonham, Masaki Sato, Maïgane Diop, Martin S Angst, David Stevenson, Nima Aghaeepour, Andrea Montanari, Brice L Gaudilliere
Published in: Research Square (2023)
High-content omic technologies coupled with sparsity-promoting regularization methods (SRMs) have transformed the biomarker discovery process. However, translating computational results into clinical use remains challenging. A rate-limiting step is the rigorous selection of reliable biomarker candidates from the many biological features included in multivariate models. We propose Stabl, a machine learning framework that unifies the biomarker discovery process with multivariate predictive modeling of clinical outcomes by selecting a sparse and reliable set of biomarkers. Evaluation of Stabl on synthetic datasets and four independent clinical studies demonstrates improved biomarker sparsity and reliability compared to commonly used SRMs at similar predictive performance. Stabl readily extends to double- and triple-omics integration tasks and identifies a sparser and more reliable set of biomarkers than those selected by state-of-the-art early- and late-fusion SRMs, thereby facilitating the biological interpretation and clinical translation of complex multi-omic predictive models. The complete package for Stabl is available online at https://github.com/gregbellan/Stabl.