Latest news

2024-12-12 The PEP Repository source code is now available on GitHub under the Apache 2.0 Open Source License.

2022-10-21 The newly formed National Education Lab AI (in Dutch: Nationaal Onderwijslab AI), abbreviated to NOLAI, has committed to using PEP to safely store and exchange data for its research projects. See their website for more information on NOLAI.

2021-07-15 The PEP team is excited to announce collaboration on the OpenPlanet platform with partners imec, Wageningen University & Research, and Radboudumc. OpenPlanet is a digital ecosystem for collecting, analyzing, and sharing data on health, sustainable agriculture, and nutrition.

2020-04-15 We have provisioned a PEP testing environment for the Chronic Pain Network. We will work together with IVIDO to develop an infrastructure for privacy-friendly sharing of patient data for research purposes.

2020-02-14 All data from the Personalized Parkinson Project have been moved to Google Cloud Storage. This milestone marks Version 2 of the PEP Research Data Repository. The system can now accommodate more than 1 petabyte of data.

2019-09-09 The Healthy Brain Study has started using the PEP system, which has been adapted to this project. The first participants have been included in the study, and thus in the PEP system, today.

PEP - Responsible Data Sharing Repository

Sensitive data can be shared responsibly with processing environments, using PEP for access management, security and pseudonymisation.

Background

PEP uses Polymorphic Encryption and Pseudonymisation to protect data and the identities of subjects (study participants). PEP is a connector between a data gathering environment (where identification of subjects is required) and a data analysis environment (where identification is not allowed). The PEP repository software and services provided by the PEP team can support controllers, researchers and data analysts in setting up secure and responsible paths for data management and data sharing.

Technology Overview

The PEP Repository software implements several key functions for data sharing:

  1. The PEP Repository offers both secure storage for sensitive/personal data, and an interface for the sharing of data to secondary processing environments.

  2. Data is protected by end-to-end encryption. The PEP client encrypts the data in the data source environment. The data remains encrypted while it is uploaded to and stored in the PEP Repository, and is decrypted only by the data user in the secondary processing environment, after downloading it from the repository.

  3. PEP Data and Access Management enables data minimisation by limiting access to only that subset of the data that is required for the specific analysis.

  4. PEP separates the roles for data administration ('what data items are available to what groups of users?') and access administration ('who gets to be part of what group of users?'). This 'four eyes principle' contributes to careful data access management.

  5. Encrypted pseudonyms for data subjects stored in PEP are polymorphous, and are translated (calculated cryptographically) to local pseudonyms for each individual user. This provides personalized pseudonymisation to avoid undesired linking of minimized data subsets.

  6. The encrypted data is stored on a server that is separate from two complementary servers that hold the cryptographic data required for decryption. This three-server setup ensures that no unauthorized user can access unencrypted PEP data without breaching at least two servers. For the same reason, a system administrator who manages a single server does not have access to the unencrypted data stored in PEP.

  7. Stored data is versioned. Users with the required permissions can obtain datasets as they were at an earlier point in time, e.g. to support peer review or repeat studies.

  8. Although PEP is designed to protect the privacy of the subjects in datasets, it provides a pathway to report incidental findings back to the data source environment, without revealing identities to the secondary processing environment.
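
As a rough illustration of item 5, the sketch below shows how a single encrypted pseudonym can be turned into a different local pseudonym for each user without being decrypted in between. It uses textbook ElGamal over a very small multiplicative group as a stand-in. Everything in it is an assumption made for illustration: the toy parameters P, Q and G, the function names, and the per-user factors. It is not the PEP implementation, and the group is far too small to offer any security.

```python
import secrets

# Toy group parameters (assumptions for illustration, NOT secure):
# P = 2*Q + 1 is a small safe prime, G generates the subgroup of order Q.
P = 2039
Q = 1019
G = 4


def keygen():
    """Return an ElGamal key pair (private x, public y = G^x mod P)."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)


def encrypt(m, y):
    """Encrypt group element m under public key y with fresh randomness."""
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), (m * pow(y, r, P)) % P


def decrypt(ct, x):
    """Recover the group element; b^(P-1-x) equals b^(-x) mod P by Fermat."""
    b, c = ct
    return (c * pow(b, P - 1 - x, P)) % P


def rerandomize(ct, y):
    """Change how the ciphertext looks without changing what it encrypts."""
    b, c = ct
    s = secrets.randbelow(Q - 1) + 1
    return (b * pow(G, s, P)) % P, (c * pow(y, s, P)) % P


def reshuffle(ct, n):
    """Turn an encryption of m into an encryption of m^n, without decrypting.

    Using a different secret factor n per user yields a different local
    pseudonym per user from the same stored ciphertext.
    """
    b, c = ct
    return pow(b, n, P), pow(c, n, P)


def demo():
    x, y = keygen()
    identity = pow(G, 123, P)       # hypothetical subject, encoded as a group element
    stored = encrypt(identity, y)   # the "polymorphic" pseudonym kept in storage

    # Hypothetical per-user factors; a real system would keep these secret.
    factors = {"researcher_a": 17, "researcher_b": 29}
    for user, n in factors.items():
        personalised = rerandomize(reshuffle(stored, n), y)
        print(user, "sees local pseudonym", decrypt(personalised, x))


if __name__ == "__main__":
    demo()
```

In this sketch the two users end up with unrelated-looking local pseudonyms for the same subject, which is what hinders linking of separately issued data subsets. A single key pair is used for brevity; issuing a separate decryption key per user is outside the scope of the sketch.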

Current Applications

PEP is currently used in biomedical research, e.g. in the Personalized Parkinson Project and the Healthy Brain Study. It is also being implemented for use with OpenPlanet and for the sharing of AI data by the Dutch National Education Lab AI (NOLAI).

Opportunity

The application of the PEP Repository software is not limited to biomedical research, AI or any other particular field. Wherever security and privacy are a concern when sharing data for secondary use, a PEP Repository is worth considering.

Organisation

PEP is currently developed by Radboud University's iHub on a not-for-profit basis. The source code is available on GitHub under the Apache 2.0 open source license.