Survival prediction and treatment optimization of multiple myeloma patients using machine-learning models based on clinical and gene expression data.
Adrián Mosquera Orgueira, Marta Sonia González Pérez, José Ángel Díaz Arias, Beatriz Antelo Rodríguez, Natalia Alonso Vence, Ángeles Bendaña López, Aitor Abuín Blanco, Laura Bao Pérez, Andrés Peleteiro Raíndo, Miguel Cid López, Manuel Mateo Pérez Encinas, José Luis Bello López, Maria-Victoria Mateos-Manteca
Published in: Leukemia (2021)
Multiple myeloma (MM) remains a mostly incurable disease with a heterogeneous clinical evolution. Despite the availability of several prognostic scores, substantial room for improvement still exists. Promising results have been obtained by integrating clinical and biochemical data with gene expression profiling (GEP). In this report, we applied machine learning algorithms to MM clinical and RNA-seq data collected by the CoMMpass consortium. We created a 50-variable random forest model (IAC-50) that predicted overall survival with high concordance in both the training and validation sets (c-indexes of 0.818 and 0.780, respectively). The model included the following covariates: patient age, ISS stage, serum β2-microglobulin, first-line treatment, and the expression of 46 genes. Survival predictions for each patient under each first-line treatment showed that individuals treated with the best-predicted drug combination were significantly less likely to die than patients treated with other schemes. This effect was particularly notable among patients treated with a triplet combination of bortezomib, an immunomodulatory drug (IMiD), and dexamethasone. Finally, the model showed a trend toward retaining its predictive value in patients with high-risk cytogenetics. In conclusion, we report a predictive model for MM survival based on the integration of clinical, biochemical, and gene expression data with machine learning tools.
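For readers unfamiliar with the c-index reported above, the following is a minimal sketch of Harrell's concordance index in plain Python. It is not the authors' code: the function name `concordance_index`, its arguments, and the toy data are illustrative assumptions, and pairs with tied survival times are simply skipped for brevity.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risks: Sequence[float],
) -> float:
    """Harrell's c-index (simplified sketch; hypothetical helper).

    times  : observed follow-up time per patient
    events : 1 if death was observed, 0 if the patient was censored
    risks  : model output, where a higher value means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i died strictly before
            # patient j's last observed time (ties in time are skipped here).
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # earlier death had the higher risk
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Toy data (invented for illustration): four patients, one censored.
toy_times = [5.0, 10.0, 15.0, 20.0]
toy_events = [1, 1, 0, 1]
toy_risks = [0.9, 0.7, 0.4, 0.1]
toy_c = concordance_index(toy_times, toy_events, toy_risks)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, which gives context for the reported values of 0.818 and 0.780.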