
Privacy-Preserving Workflow for the Cross-Border Federated Analysis of Clinical Data.

Miroslav Puskaric, Balasubramanian Chandramouli, Thomas Osmo, Roy Gusinow, Chiara Dellacasa, Elisa Rossi, Salvatore Cataudella, Anna Górska, Eugenia Rinaldi
Published in: Studies in health technology and informatics (2024)
The motivation behind this research is to perform a privacy-preserving analysis of data located at remote sites and in different jurisdictions, with no possibility of sharing individual-level information. Here, we present key findings from a requirements analysis and the resulting federated data analysis workflow, built using open-source research software, in which patient-level information is securely stored and never exposed during the analysis process. We present additional improvements that further strengthen the security of the workflow, and we emphasize and showcase the use of data harmonization in the analysis. The data analysis is done using the R language for statistical computing and the DataSHIELD libraries for non-disclosive analysis of sensitive data. The workflow was validated against two data analysis scenarios, reproducing the results obtained with a centralized analysis approach. The clinical datasets are part of the large pan-European SARS-CoV-2 cohort collected and managed by the ORCHESTRA project. We demonstrate the viability of establishing a cross-border federated data analysis framework and conducting an analysis without exposing patient-level information, achieving results equivalent to those of a centralized, non-secure analysis. However, it is vital to meet the requirements associated with data harmonization, anonymization and IT infrastructure in order to maintain availability, usability and data security.
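The claim that a federated analysis can match a centralized one may be easier to see in a small illustration. The sketch below is written in Python and is not the authors' workflow or the DataSHIELD API: the names `SiteSummary`, `summarize`, `pooled_mean_variance` and the threshold `MIN_COUNT` are hypothetical, and the values are made up. It assumes each site releases only aggregate sums, and only when it holds enough records, and that the analyst combines those aggregates into a pooled mean and variance.

```python
import statistics
from dataclasses import dataclass
from math import isclose

# Hypothetical disclosure threshold: a site refuses to release
# aggregates computed from fewer records than this.
MIN_COUNT = 3


@dataclass(frozen=True)
class SiteSummary:
    """Aggregates a site is willing to share; no individual values."""
    n: int
    total: float
    total_sq: float


def summarize(values, min_count=MIN_COUNT):
    """Runs at the site: reduce patient-level values to aggregates."""
    if len(values) < min_count:
        raise ValueError("too few records to release an aggregate")
    return SiteSummary(
        n=len(values),
        total=sum(values),
        total_sq=sum(v * v for v in values),
    )


def pooled_mean_variance(summaries):
    """Runs at the analyst: combine site aggregates only."""
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    # Sample variance from sums: (sum(x^2) - n * mean^2) / (n - 1)
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, variance


# Made-up example values standing in for one variable held at two sites.
site_a = [61.0, 47.0, 70.0, 55.0]
site_b = [39.0, 66.0, 58.0]

fed_mean, fed_var = pooled_mean_variance([summarize(site_a), summarize(site_b)])

# Centralized reference: only possible here because the data are synthetic.
central_mean = statistics.mean(site_a + site_b)
central_var = statistics.variance(site_a + site_b)

print(isclose(fed_mean, central_mean), isclose(fed_var, central_var))
```

In this toy setting the federated and centralized results agree up to floating-point rounding, because the mean and variance depend on the data only through sums that can be added across sites. Statistics that do not decompose this way, such as medians or regression fits, need other approaches, which this sketch does not cover.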
Keyphrases
  • data analysis
  • electronic health record
  • big data
  • health information
  • sars cov
  • public health
  • machine learning
  • climate change
  • single cell
  • respiratory syndrome coronavirus