Combining Cox regressions across a heterogeneous distributed research network facing small and zero counts.
Martijn J Schuemie, Yong Chen, David Madigan, Marc A Suchard
Published in: Statistical Methods in Medical Research (2021)
Studies of the effects of medical interventions increasingly take place in distributed research settings using data from multiple clinical data sources, including electronic health records and administrative claims. In such settings, privacy concerns typically prohibit sharing of individual patient data; instead, cross-network analyses can only use summary statistics from the individual databases, such as hazard ratios and standard errors. In the specific but very common context of the Cox proportional hazards model, we show that combining such per-site summary statistics into a single network-wide estimate using standard meta-analysis methods leads to substantial bias when outcome counts are small. This bias derives primarily from the normal approximations of the per-site likelihood that these methods utilize. Here we propose and evaluate methods that eschew normal approximations in favor of three more flexible approximations: a skew-normal, a one-dimensional grid, and a custom parametric function that mimics the behavior of the Cox likelihood function. In extensive simulation studies, we demonstrate how these approximations impact bias in the context of both fixed-effects and (Bayesian) random-effects models. We then apply these approaches to three real-world studies of the comparative safety of antidepressants, each using data from four observational health care databases.
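The contrast between a normal approximation and a one-dimensional grid can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the three toy sites, and the grid range are all assumptions made for illustration, and the setting is simplified to a single binary exposure with no tied event times and fixed-effects pooling only. Each site is summarized either by a log hazard ratio and standard error (normal approximation) or by its Cox partial log-likelihood evaluated on a shared grid of log hazard ratios.

```python
import math

def cox_loglik(beta, events):
    """Cox partial log-likelihood for one binary covariate, no tied event times.

    `events` is a list of (x, n1, n0): exposure of the subject who had the
    event, and the numbers of exposed / unexposed subjects at risk at that time.
    """
    return sum(x * beta - math.log(n1 * math.exp(beta) + n0) for x, n1, n0 in events)

def grid_profile(events, grid):
    """Per-site summary for the grid approach: log-likelihood at each grid point."""
    return [cox_loglik(b, events) for b in grid]

def pooled_grid_estimate(profiles, grid):
    """Fixed-effects pooling: sum the site profiles pointwise, return the grid argmax."""
    total = [sum(vals) for vals in zip(*profiles)]
    best = max(range(len(grid)), key=total.__getitem__)
    return grid[best]

def normal_summary(events, iters=25):
    """Per-site (log hazard ratio, standard error) via Newton-Raphson.

    Returns None when all events fall in one arm, because the maximum
    likelihood estimate is then infinite and no normal summary exists.
    """
    xs = [x for x, _, _ in events]
    if sum(xs) in (0, len(xs)):
        return None
    beta = 0.0
    for _ in range(iters):
        p = [n1 * math.exp(beta) / (n1 * math.exp(beta) + n0) for _, n1, n0 in events]
        info = sum(pi * (1 - pi) for pi in p)   # observed information at current beta
        beta += (sum(xs) - sum(p)) / info       # Newton step using the score
    # info comes from the last iterate, which equals the final beta at convergence
    return beta, 1 / math.sqrt(info)

def pooled_normal_estimate(summaries):
    """Inverse-variance fixed-effects pooling; sites without a summary are dropped."""
    usable = [s for s in summaries if s is not None]
    weights = [1 / se ** 2 for _, se in usable]
    return sum(w * b for w, (b, _) in zip(weights, usable)) / sum(weights)

# Hypothetical toy data, not taken from the paper.
site_a = [(1, 5, 5), (0, 4, 5), (1, 4, 4), (0, 3, 4)]
site_b = [(1, 6, 6), (1, 5, 6), (0, 4, 6)]
site_c = [(0, 3, 3), (0, 3, 2)]           # zero events among the exposed
sites = [site_a, site_b, site_c]

grid = [i / 100 for i in range(-300, 301)]  # log hazard ratios from -3 to 3

normal_pooled = pooled_normal_estimate([normal_summary(s) for s in sites])
grid_pooled = pooled_grid_estimate([grid_profile(s, grid) for s in sites], grid)
grid_pooled_without_c = pooled_grid_estimate(
    [grid_profile(s, grid) for s in (site_a, site_b)], grid)

print("normal approximation (site C dropped):", round(normal_pooled, 2))
print("grid approximation (all sites):       ", grid_pooled)
```

In this toy setting the site with zero exposed events has no finite estimate, so inverse-variance pooling silently discards it, whereas its grid profile still contributes evidence toward a lower hazard ratio. The skew-normal and custom parametric approximations named in the abstract, and the Bayesian random-effects model, are outside the scope of this sketch.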
Keyphrases
- electronic health record
- big data
- healthcare
- case control
- systematic review
- clinical decision support
- adverse drug
- emergency department
- major depressive disorder
- case report
- randomized controlled trial
- health insurance
- physical activity
- social media
- neural network
- bipolar disorder
- data analysis
- network analysis
- quality improvement