DataSHIELD

Transferring large datasets between sites poses many logistical, legal and ethical challenges. To overcome these, data from the EU Child Cohort Network are analysed using DataSHIELD.

DataSHIELD is an open-source technological solution for coordinating analyses of data that cannot be shared for practical or ethical reasons. It allows researchers to analyse individual-level information from multiple sources without the data, or any identifying information derived from them, ever being disclosed. Further information on DataSHIELD, including how it works, links to the DataSHIELD package manuals, upcoming DataSHIELD-related events and DataSHIELD news, can be found on the DataSHIELD website: http://www.datashield.ac.uk.
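As a rough illustration of the principle described above (analysing individual-level data without disclosing them), the Python sketch below pools a mean across two sites using only per-site summary statistics. It is a conceptual toy, not DataSHIELD code and not the DataSHIELD API: the function names, the `MIN_COUNT` threshold and the birth-weight values are all hypothetical, chosen only to show the idea.

```python
from dataclasses import dataclass
from typing import List, Sequence

# Hypothetical disclosure threshold, chosen for illustration only.
MIN_COUNT = 5


@dataclass
class SiteSummary:
    """Aggregate statistics a site is willing to release."""
    n: int
    total: float


def summarise_at_site(values: Sequence[float]) -> SiteSummary:
    """Runs inside one site; only the count and the sum are returned."""
    if len(values) < MIN_COUNT:
        # Refuse to release a summary computed from very few records.
        raise ValueError("too few records to release a summary")
    return SiteSummary(n=len(values), total=float(sum(values)))


def pooled_mean(summaries: List[SiteSummary]) -> float:
    """Runs at the analyst's end and sees only the per-site summaries."""
    n = sum(s.n for s in summaries)
    return sum(s.total for s in summaries) / n


# Made-up birth weights (grams) held separately by two hypothetical cohorts.
site_a = [3200.0, 3400.0, 3100.0, 3600.0, 3300.0]
site_b = [3000.0, 3500.0, 3250.0, 3450.0, 3350.0, 3150.0]

summaries = [summarise_at_site(site_a), summarise_at_site(site_b)]
print(pooled_mean(summaries))
```

In this sketch the analyst only ever handles the `SiteSummary` objects, and the pooled result is identical to the mean that would be obtained if the individual records had been combined in one place.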

Other useful resources are the DataSHIELD Wiki page:
https://wikis.bris.ac.uk/display/DSDEV/Tutorial+for+DataSHIELD+users

The LifeCycle GitHub repository, where users can share code:
https://github.com/lifecycle-project/analysis-tutorials/blob/master/GIT-WORKFLOW.md

For technical and security documentation:
Security architecture LifeCycle

Useful references:
  • Doiron D et al. (2017). Software Application Profile: Opal and Mica: open-source software solutions for epidemiological data management, harmonization and dissemination. Int J Epidemiol. 46(5):1372-1378
  • Budin-Ljøsne I et al. (2015). DataSHIELD: an ethically robust solution to multiple-site individual-level data analysis. Public Health Genomics. 18(2):87-96
  • Gaye A et al. (2014). DataSHIELD: taking the analysis to the data, not the data to the analysis. Int J Epidemiol. 43(6):1929-44
  • Wallace SE et al. (2014). Protecting personal data in epidemiological research: DataSHIELD and UK law. Public Health Genomics. 17(3):149-57
  • Wolfson M et al. (2010). DataSHIELD: resolving a conflict in contemporary bioscience--performing a pooled analysis of individual-level data without sharing the data. Int J Epidemiol. 39(5):1372-82