Inferring latent heterogeneity using many feature variables supervised by survival outcome.

Beilin Jia, Donglin Zeng, Jason J Z Liao, Guanghan Frank Liu, Xianming M Tan, Guoqing Diao, Joseph G Ibrahim
Published in: Statistics in medicine (2021)
In cancer studies, it is important to understand disease heterogeneity among patients so that precision medicine can target high-risk patients at the right time. Many feature variables, such as demographic variables and biomarkers, can be combined with a patient's survival outcome to infer such latent heterogeneity. In this work, we propose a mixture model for each patient's latent survival pattern, in which the mixing probabilities for the latent groups are modeled through a multinomial distribution. The Bayesian information criterion is used to select the number of latent groups. Furthermore, we incorporate variable selection with the adaptive lasso into inference so that only a few feature variables are selected to characterize the latent heterogeneity. We show that our adaptive lasso estimator has oracle properties when the number of parameters diverges with the sample size. The finite sample performance is evaluated in a simulation study, and the proposed method is illustrated with two datasets.
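The three ingredients named in the abstract (multinomial mixing probabilities, the Bayesian information criterion, and an adaptive lasso penalty) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of the last latent group as the reference category, and the weight exponent `gamma` are assumptions made for illustration only, and the survival likelihood itself is not modeled here.

```python
import math


def mixing_probabilities(x, coefs):
    """Multinomial-logit mixing probabilities for one patient (illustrative).

    x     : list of feature values for the patient
    coefs : list of G-1 coefficient lists; the last group is taken as the
            reference with linear score 0 (an assumption of this sketch)
    """
    scores = [sum(b * v for b, v in zip(beta, x)) for beta in coefs] + [0.0]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def bic(log_likelihood, n_params, n_samples):
    """Standard BIC: -2 * log-likelihood + (number of parameters) * log(n)."""
    return -2.0 * log_likelihood + n_params * math.log(n_samples)


def adaptive_lasso_penalty(beta, beta_init, lam, gamma=1.0):
    """Adaptive lasso penalty: lam * sum_j |beta_j| / |beta_init_j| ** gamma.

    beta_init holds initial (unpenalized) estimates, assumed nonzero here.
    """
    return lam * sum(abs(b) / abs(b0) ** gamma for b, b0 in zip(beta, beta_init))
```

In such a sketch, one would fit the mixture for several candidate numbers of latent groups and keep the one with the smallest `bic` value; the penalty would be subtracted from the log-likelihood during estimation so that coefficients of uninformative features shrink to zero.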
Keyphrases
  • single cell
  • machine learning
  • deep learning
  • rna seq
  • healthcare
  • free survival
  • health information
  • case control