Variable selection in semiparametric regression models for longitudinal data with informative observation times.

Omidali Aghababaei Jazi, Eleanor M Pullenayegum
Published in: Statistics in Medicine (2022)
A common issue in longitudinal studies is that subjects' visits are irregular and may depend on observed outcome values; such data are known as longitudinal data with informative observation (follow-up) times. Semiparametric regression modeling for this type of data has received much attention because it provides more flexibility in studying the association between regression factors and a longitudinal outcome. An important problem is how to select relevant variables and estimate their coefficients in semiparametric regression models when the number of baseline covariates is large. Current penalization procedures in semiparametric regression models for longitudinal data do not account for informative observation times. We propose a variable selection procedure that is suitable for estimation methods based on pseudo-score functions. We investigate the asymptotic properties of the penalized estimators and conduct simulation studies to illustrate the theoretical results. We also apply the procedure to variable selection in semiparametric regression models for the STAR*D dataset, which comes from a multistage randomized clinical trial for treating major depressive disorder.
Keyphrases
  • major depressive disorder
  • electronic health record
  • big data
  • cross sectional
  • bipolar disorder
  • minimally invasive
  • working memory
  • data analysis
  • deep learning
  • double blind