Collaborative inference for treatment effect with distributed data-sharing management in multicenter studies.
Mengtong Hu, Shi Xu, Peter Xuekun Song
Published in: Statistics in Medicine (2024)
Data-sharing barriers pose major challenges in multicenter clinical studies, where multiple data sources are stored and managed in a distributed fashion at different local study sites. Merging such data sources into a common data storage for a centralized statistical analysis requires a data use agreement, which is often time-consuming to obtain. Data merging may become more burdensome when propensity score modeling is involved in the analysis, because it requires combining many confounding variables, and the systematic incorporation of this additional modeling in a meta-analysis has not been thoroughly investigated in the literature. Motivated by a multicenter clinical trial of basal insulin treatment for reducing the risk of post-transplantation diabetes mellitus, we propose a new inference framework that avoids merging subject-level raw data from multiple sites at a centralized facility and requires only the sharing of summary statistics. Unlike the architecture of federated learning, the proposed collaborative inference does not need a central site to combine local results and thus enjoys maximal protection of data privacy and minimal sensitivity to unbalanced data distributions across data sources. We show theoretically and numerically that the new distributed inference approach has little loss of statistical power compared to the centralized method that requires merging the entire data. We present large-sample properties and algorithms for the proposed method. We illustrate its performance through simulation experiments and the motivating example on the differential average treatment effect of basal insulin in lowering the risk of diabetes among kidney-transplant patients compared to the standard of care.
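To make the idea of sharing only summary statistics concrete, the sketch below shows a generic inverse-variance combination of site-level treatment effect estimates. It is not the collaborative inference procedure proposed in the paper, which also handles propensity score modeling. It is a simplified illustration that assumes a randomized comparison at each site, so that a difference in means is a reasonable local estimate. The names `SiteSummary`, `local_summary`, and `combine`, and all data values, are hypothetical.

```python
from statistics import mean, variance
from typing import NamedTuple, Sequence


class SiteSummary(NamedTuple):
    """The only quantities a site shares: an estimate and its variance."""
    estimate: float
    variance: float


def local_summary(treated: Sequence[float], control: Sequence[float]) -> SiteSummary:
    """Computed inside one site from its own subject-level outcomes.

    Assumption: treatment is randomized within the site, so the
    difference in sample means estimates the average treatment effect.
    """
    est = mean(treated) - mean(control)
    # Variance of a difference of two independent sample means.
    var = variance(treated) / len(treated) + variance(control) / len(control)
    return SiteSummary(est, var)


def combine(summaries: Sequence[SiteSummary]) -> SiteSummary:
    """Inverse-variance weighted combination of shared summaries.

    Any site holding the shared summaries can run this step, so no
    raw data and no dedicated central site are required for it.
    """
    weights = [1.0 / s.variance for s in summaries]
    total = sum(weights)
    est = sum(w * s.estimate for w, s in zip(weights, summaries)) / total
    return SiteSummary(est, 1.0 / total)


# Hypothetical outcomes at two sites (illustrative values only).
site_a = local_summary([5.1, 4.8, 5.6, 5.0], [4.2, 4.5, 4.0, 4.4])
site_b = local_summary([5.3, 5.9, 5.2], [4.6, 4.1, 4.9])
pooled = combine([site_a, site_b])
```

In this sketch the pooled variance is always smaller than each site's own variance, which is the basic reason combining summaries can recover much of the efficiency of a merged analysis.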
Keyphrases
- electronic health record
- big data
- type diabetes
- clinical trial
- systematic review
- stem cells
- randomized controlled trial
- chronic kidney disease
- cardiovascular disease
- end stage renal disease
- blood pressure
- metabolic syndrome
- adipose tissue
- health information
- deep learning
- glycemic control
- quality improvement
- body composition
- resistance training
- cell therapy
- health insurance
- high intensity
- patient reported outcomes
- high speed
- insulin resistance
- replacement therapy
- patient reported