Bayesian semiparametric joint modeling of a count outcome and inconveniently timed longitudinal predictors.
Woobeen Lim, Michael L Pennell, Michelle J Naughton, Electra D Paskett
Published in: Statistical Methods in Medical Research (2023)
The Women's Health Initiative (WHI) Life and Longevity After Cancer (LILAC) study is an excellent resource for studying the quality of life following breast cancer treatment. At study entry, women were asked about new symptoms that appeared following their initial cancer treatment. In this article, we were interested in using regression modeling to estimate associations of clinical and lifestyle factors at cancer diagnosis (independent variables) with the number of new symptoms (dependent variable). Although clinical and lifestyle data were collected longitudinally, few measurements were obtained at diagnosis or at a consistent timepoint prior to diagnosis, which complicates the analysis. Furthermore, parametric count models, such as the Poisson and negative binomial, do not fit the symptom data well. Thus, motivated by the issues encountered in LILAC, we propose two Bayesian joint models for longitudinal data and a count outcome. Our two models differ according to the assumption on the outcome distribution: one uses a negative binomial (NB) distribution and the other a nonparametric rounded mixture of Gaussians (RMG). The mean of each count distribution is dependent on imputed values of continuous, binary, and ordinal variables at a time point of interest (e.g. diagnosis). To facilitate imputation, longitudinal variables are modeled jointly using a linear mixed model for a latent underlying normal random variable, and a Dirichlet process prior is assigned to the random subject-specific effects to relax distribution assumptions. In simulation studies, the RMG joint model exhibited superior power and predictive accuracy over the NB model when the data were not NB. The RMG joint model also outperformed an RMG model containing predictors imputed using the last value carried forward, which generated estimates that were biased toward the null. 
We used our models to examine the relationship between sleep health at diagnosis and the number of new symptoms following breast cancer treatment in LILAC.
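To make the rounded mixture of Gaussians (RMG) idea concrete, the following is a minimal sketch of how a count distribution can be obtained by rounding a latent Gaussian mixture. It is not the authors' implementation. The thresholds used here (a count of 0 when the latent value is at most 0, and a count of j when the latent value lies in (j-1, j]) are an assumed convention, and the function names `normal_cdf` and `rmg_pmf` and the mixture weights, means, and standard deviations are hypothetical values chosen for illustration only.

```python
import math


def normal_cdf(x, mean, sd):
    """CDF of a Normal(mean, sd^2) distribution, handling infinite arguments."""
    if x == math.inf:
        return 1.0
    if x == -math.inf:
        return 0.0
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def rmg_pmf(j, weights, means, sds):
    """P(Y = j) when Y is obtained by rounding a latent Gaussian mixture.

    Assumed rounding rule (illustrative, not taken from the article):
      Y = 0 if the latent value is <= 0,
      Y = j if the latent value lies in (j - 1, j] for j >= 1.
    """
    lower = -math.inf if j == 0 else float(j - 1)
    upper = float(j)
    return sum(
        w * (normal_cdf(upper, m, s) - normal_cdf(lower, m, s))
        for w, m, s in zip(weights, means, sds)
    )


# Hypothetical two-component mixture: many subjects with few new symptoms,
# plus a smaller group with a higher symptom count.
weights = [0.6, 0.4]
means = [0.5, 6.0]
sds = [1.0, 2.0]

pmf = [rmg_pmf(j, weights, means, sds) for j in range(61)]
total = sum(pmf)
```

Because the mixture can place mass freely along the latent scale, the resulting count distribution can show excess zeros or multiple modes that a single Poisson or negative binomial distribution cannot capture, which is the motivation the abstract gives for the RMG model.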
Keyphrases
- electronic health record
- healthcare
- public health
- big data
- physical activity
- metabolic syndrome
- peripheral blood
- cross sectional
- cardiovascular disease
- papillary thyroid
- mental health
- polycystic ovary syndrome
- sleep quality
- weight loss
- pregnant women
- depressive symptoms
- artificial intelligence
- insulin resistance
- lymph node metastasis
- health information
- risk assessment
- data analysis
- social media
- deep learning
- squamous cell
- cervical cancer screening
- quality improvement
- virtual reality