Speeding up interval estimation for R²-based mediation effect of high-dimensional mediators via cross-fitting.
Zhichao Xu, Chunlin Li, Sunyi Chi, Tianzhong Yang, Peng Wei. Published in: bioRxiv: the preprint server for biology (2023)
Mediation analysis is a useful tool in biomedical research for investigating how molecular phenotypes, such as gene expression, mediate the effect of an exposure on health outcomes. However, when the mediators are high-dimensional omics measurements, commonly used mean-based total mediation effect measures may suffer from cancellation of component-wise mediation effects that act in opposite directions. To overcome this limitation, a variance-based R-squared total mediation effect measure has recently been proposed; it relies, however, on the computationally intensive nonparametric bootstrap for confidence interval estimation. In this work, we formulate a more efficient two-stage cross-fitted estimation procedure for the R-squared measure. To avoid potential bias, we perform iterative Sure Independence Screening (iSIS) in two subsamples to exclude the non-mediators, followed by ordinary least squares (OLS) regressions for the variance estimation. We then construct confidence intervals based on the newly derived closed-form asymptotic distribution of the R-squared measure. Extensive simulation studies demonstrate that the proposed procedure is hundreds of times more computationally efficient than the resampling-based method, with comparable coverage probability. Furthermore, when applied to the Framingham Heart Study, the proposed method replicated the established finding that gene expression mediates age-related variation in systolic blood pressure and discovered a role for gene expression profiles in the relationship between sex and high-density lipoprotein cholesterol. The proposed cross-fitted interval estimation procedure is implemented in the R package RsqMed.
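The two-stage idea in the abstract (screen mediators in one subsample, estimate variances by OLS in the other, then swap roles) can be illustrated with a minimal Python sketch. It is not the RsqMed implementation, and several pieces are assumptions: the measure is taken to be R²(Y~M) + R²(Y~X) - R²(Y~M+X), which the abstract does not spell out; a plain marginal-correlation ranking stands in for iSIS; the two fold estimates are simply averaged; and the closed-form confidence interval is omitted because its formula is not given here. All function names and the simulated data are hypothetical.

```python
import numpy as np


def r_squared(y, design):
    """In-sample R^2 from an OLS fit of y on design (intercept added)."""
    design = np.asarray(design, dtype=float).reshape(len(y), -1)
    X = np.column_stack([np.ones(len(y)), design])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)


def screen_mediators(M, y, keep):
    """Stand-in for iSIS (assumption): keep the columns of M with the
    largest absolute marginal correlation with y."""
    Mc = M - M.mean(axis=0)
    yc = y - y.mean()
    corr = (Mc.T @ yc) / (np.linalg.norm(Mc, axis=0) * np.linalg.norm(yc))
    return np.argsort(-np.abs(corr))[:keep]


def rsq_mediated(x, M, y):
    """Assumed form of the measure: R2(y~M) + R2(y~x) - R2(y~M+x)."""
    return (r_squared(y, M) + r_squared(y, x)
            - r_squared(y, np.column_stack([M, x])))


def cross_fitted_rsq(x, M, y, keep=10, seed=0):
    """Screen on one half, estimate on the other, swap, and average."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    half_a, half_b = idx[: len(y) // 2], idx[len(y) // 2:]
    estimates = []
    for sel, est in ((half_a, half_b), (half_b, half_a)):
        chosen = screen_mediators(M[sel], y[sel], keep)
        estimates.append(rsq_mediated(x[est], M[est][:, chosen], y[est]))
    return float(np.mean(estimates))


def simulate(n=400, p=50, n_true=5, seed=1):
    """Toy data: the first n_true columns of M mediate x -> y; the rest are noise."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    M = rng.normal(size=(n, p))
    M[:, :n_true] += 0.8 * x[:, None]
    y = 0.5 * M[:, :n_true].sum(axis=1) + rng.normal(size=n)
    return x, M, y


x, M, y = simulate()
print(cross_fitted_rsq(x, M, y, keep=10))
```

Because screening and estimation use disjoint halves, the OLS fits are not contaminated by the selection step, which is the motivation for cross-fitting. In this sketch `keep` must stay well below the half-sample size so that the OLS fits remain well-posed.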
Keyphrases
- gene expression
- blood pressure
- social support
- dna methylation
- minimally invasive
- heart failure
- healthcare
- depressive symptoms
- left ventricular
- atrial fibrillation
- risk assessment
- magnetic resonance imaging
- magnetic resonance
- hypertensive patients
- single cell
- adipose tissue
- computed tomography
- health insurance
- virtual reality
- glycemic control