Login / Signup

Generalized additive models to analyze nonlinear trends in biomedical longitudinal data using R: Beyond repeated measures ANOVA and linear mixed models.

Ariel I MundoJohn R TiptonTimothy J Muldoon
Published in: Statistics in medicine (2022)
In biomedical research, the outcome of longitudinal studies has been traditionally analyzed using the repeated measures analysis of variance (rm-ANOVA) or more recently, linear mixed models (LMEMs). Although LMEMs are less restrictive than rm-ANOVA as they can work with unbalanced data and non-constant correlation between observations, both methodologies assume a linear trend in the measured response. It is common in biomedical research that the true trend response is nonlinear and in these cases the linearity assumption of rm-ANOVA and LMEMs can lead to biased estimates and unreliable inference. In contrast, GAMs relax the linearity assumption of rm-ANOVA and LMEMs and allow the data to determine the fit of the model while also permitting incomplete observations and different correlation structures. Therefore, GAMs present an excellent choice to analyze longitudinal data with non-linear trends in the context of biomedical research. This paper summarizes the limitations of rm-ANOVA and LMEMs and uses simulated data to visually show how both methods produce biased estimates when used on data with non-linear trends. We present the basic theory of GAMs and using reported trends of oxygen saturation in tumors, we simulate example longitudinal data (2 treatment groups, 10 subjects per group, 5 repeated measures for each group) to demonstrate their implementation in R. We also show that GAMs are able to produce estimates with non-linear trends even when incomplete observations exist (with 40% of the simulated observations missing). To make this work reproducible, the code and data used in this paper are available at: https://github.com/aimundo/GAMs-biomedical-research.
Keyphrases
  • electronic health record
  • big data
  • magnetic resonance imaging
  • primary care
  • machine learning
  • data analysis
  • artificial intelligence
  • mass spectrometry
  • single cell
  • decision making
  • case control