Estimation of the proportion of treatment effect explained by a high-dimensional surrogate.
Ruixuan Rachel Zhou, Sihai Dave Zhao, Layla Parast. Published in: Statistics in Medicine (2022)
Clinical studies examining the effectiveness of a treatment with respect to some primary outcome often require long-term follow-up of patients and/or costly or burdensome measurements of the primary outcome of interest. Identifying a surrogate marker for the primary outcome of interest may allow one to evaluate a treatment effect with less follow-up time, less cost, or less burden. While much clinical and statistical work has focused on identifying and validating surrogate markers, available approaches tend to focus on settings in which only a single surrogate marker is of interest. Limited work has been done to accommodate the high-dimensional surrogate marker setting where the number of potential surrogates is greater than the sample size. In this article, we develop methods to estimate the proportion of treatment effect explained by high-dimensional surrogates. We study the asymptotic properties of our proposed estimator, propose inference procedures, and examine finite sample performance via a simulation study. We illustrate our proposed methods using data from a randomized study comparing a novel whey-based oral nutrition supplement with a standard supplement with respect to change in body fat percentage over 12 weeks, where the surrogate markers of interest are gene expression probesets.
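For orientation, the quantity named in the title is often formalized in the surrogate-marker literature as follows. This is a sketch under assumptions: the abstract does not state the definition the authors use, and the symbols Y (primary outcome), A (randomized treatment indicator), \Delta, \Delta_S, and R_S are introduced here for illustration only.

\[
\Delta = E(Y \mid A = 1) - E(Y \mid A = 0), \qquad R_S = 1 - \frac{\Delta_S}{\Delta},
\]

Here \Delta is the overall treatment effect on the primary outcome, and \Delta_S denotes the residual treatment effect that would remain if the surrogate markers had the same distribution in both treatment groups. Under this convention, R_S near 1 suggests the surrogates capture most of the treatment effect, and R_S near 0 suggests they capture little of it.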