University College London

How safe is your open data? Balancing transparency and risk of identification

poster
posted on 2024-07-11, 12:06, authored by Joseph Lam

This piece is a case study from participating in the Replication Games, an organised open science initiative to replicate papers published in Nature Communications.

Full data transparency is highly valued in open science. However, while attempting to replicate studies using publicly available open data, we noticed that the handling of sensitive data is often overlooked. For example, there are very few data minimisation practices, no acknowledgement of the Five Safes principles, and little consideration of the risks of disclosing personal information and of re-identification.

This poster presents a case study of published open data, assessing how identifiable the anonymised participants are under potential linkage attacks and other techniques. I examine the dataset's level of anonymity using common techniques.
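The abstract does not say which techniques were used. As an illustration only, one widely used measure is k-anonymity: group records by their quasi-identifiers (attributes that could be matched against an external dataset in a linkage attack) and take the size of the smallest group. The sketch below assumes hypothetical column names (`age_band`, `sex`, `region`) and invented toy records; it is not the poster's dataset or method.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return (k, groups), where k is the size of the smallest group of
    records that share identical quasi-identifier values."""
    if not records:
        raise ValueError("records must not be empty")
    groups = Counter(
        tuple(rec[q] for q in quasi_identifiers) for rec in records
    )
    return min(groups.values()), groups


def unique_fraction(records, quasi_identifiers):
    """Share of records whose quasi-identifier combination is unique,
    i.e. the records most exposed to a linkage attack."""
    _, groups = k_anonymity(records, quasi_identifiers)
    unique = sum(1 for size in groups.values() if size == 1)
    return unique / len(records)


# Hypothetical toy data, invented for illustration.
records = [
    {"age_band": "20-29", "sex": "F", "region": "North"},
    {"age_band": "20-29", "sex": "F", "region": "North"},
    {"age_band": "30-39", "sex": "M", "region": "South"},
    {"age_band": "30-39", "sex": "M", "region": "South"},
    {"age_band": "40-49", "sex": "F", "region": "East"},
]
qi = ["age_band", "sex", "region"]

k, groups = k_anonymity(records, qi)
share_unique = unique_fraction(records, qi)
print(f"k = {k}, unique share = {share_unique:.2f}")
```

In this toy example one participant has a unique combination of quasi-identifiers, so the dataset is only 1-anonymous. Anyone holding an external dataset with the same three attributes could single that person out.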

Researchers and the public lack awareness of the risks of linkage attacks, of how researchers could ethically manage their data to mitigate these risks, and of how to communicate them proportionately to research participants.

Researchers have a responsibility to share open participant data safely and proportionately.

