Modeling longitudinal change in biomarkers using data from a complex survey sampling design: An application to the Hispanic Community Health Study/Study of Latinos.
Nicole M Butera, Donglin Zeng, Gerardo Heiss, Jianwen Cai. Published in: Statistics in Medicine (2023)
In observational cohort studies, there is frequently interest in modeling longitudinal change in a biomarker (ie, physiological measure indicative of metabolic dysregulation or disease; eg, blood pressure) in the absence of treatment (ie, medication), and its association with modifiable risk factors expected to affect health (eg, body mass index). However, individuals may start treatment during the study period, and consequently biomarker values observed while on treatment may differ from those that would have been observed in the absence of treatment. Excluding treated individuals from the analysis may bias effect estimates if treated individuals differ systematically from untreated individuals. We addressed this concern in the setting of the Hispanic Community Health Study/Study of Latinos (HCHS/SOL), an observational cohort study that employed a complex survey sampling design to enable inference to a finite target population. We considered biomarker values measured while on treatment to be missing data, and applied missing data methodology (inverse probability weighting (IPW) and doubly robust estimation) to this problem. The proposed methods leverage information collected between study visits on when individuals started treatment, by adapting IPW and doubly robust approaches to model the treatment mechanism using survival analysis methods. This methodology also incorporates sampling weights and uses a bootstrap approach to estimate standard errors that account for the complex survey sampling design. We investigated variance estimation for these methods, conducted simulation studies to assess statistical performance in finite samples, and applied the methodology to model temporal change in blood pressure in HCHS/SOL.
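The weighting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `untreated_probability` and `ipw_mean`, the constant-hazard survival model, and the toy data are all assumptions made for illustration. Each untreated individual receives a combined weight equal to the sampling weight divided by the estimated probability of still being untreated at the visit, and biomarker values measured on treatment are handled as missing. Doubly robust estimation and the bootstrap standard errors are omitted.

```python
import math


def untreated_probability(hazard, follow_up_time):
    """Probability of remaining untreated up to follow_up_time.

    Assumes a constant treatment-initiation hazard, so S(t) = exp(-hazard * t).
    This stands in for whatever survival model is actually fitted.
    """
    return math.exp(-hazard * follow_up_time)


def ipw_mean(values, untreated, p_untreated, sampling_weights):
    """Weighted mean biomarker value among individuals who are untreated.

    Each untreated individual gets weight = sampling weight / P(untreated).
    Values measured on treatment are treated as missing and skipped.
    """
    numerator = 0.0
    denominator = 0.0
    for y, is_untreated, p, w in zip(values, untreated, p_untreated, sampling_weights):
        if not is_untreated:
            continue  # on-treatment value: considered missing
        combined_weight = w / p
        numerator += combined_weight * y
        denominator += combined_weight
    return numerator / denominator


# Toy data (hypothetical): systolic blood pressure at a follow-up visit.
values = [120.0, 130.0, None, 140.0]      # None: measured while on treatment
untreated = [True, True, False, True]
p_untreated = [0.5, 0.5, 0.5, 1.0]        # e.g., from a fitted survival model
sampling_weights = [1.0, 1.0, 1.0, 2.0]   # survey sampling weights

estimate = ipw_mean(values, untreated, p_untreated, sampling_weights)
print(estimate)
```

In this sketch the weights are normalized by their sum, so the estimate is a ratio-type weighted mean rather than a weighted total. In practice the untreated probabilities would come from a survival model fitted to the between-visit treatment start times rather than from fixed numbers.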