Pseudonymization, Anonymization & GDPR

Brad Perry

Published Nov 13, 2018

The General Data Protection Regulation (GDPR) is an EU regulation designed to protect the personal data of individuals within the European Union (EU). Specifically, it gives individuals more control over their personal data and how it is used. It provides a uniform data protection framework for all EU member states, largely replacing the patchwork of national data protection laws that existed before it. As a result, companies now have a legal incentive to protect, and keep private, any data they collect from their users.

Recital 26 of the GDPR states:

“The principles of data protection should therefore not apply to anonymous information, namely information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable.”

Pseudonymization and anonymization are methods that the GDPR encourages because they reduce risk and help data controllers and processors comply with their data protection obligations. These techniques protect individuals by masking data (such as a name, date of birth, or address) that would enable someone to link the data to them. To gain a high-level understanding of how the methods affect personal data, think of your data as a range: completely unprotected, fully identifiable personal data sits at one end, anonymized data with no identifiable information sits at the other, and pseudonymized data falls somewhere in between. The Future of Privacy Forum put together a nice visual guide to illustrate the concepts:

https://fpf.org/2016/04/25/a-visual-guide-to-practical-data-de-identification/

What is anonymization?

Anonymization is the permanent removal of any information that may serve as an identifier. Once a data set has been truly anonymized, individuals can no longer be identified from it. Anonymizing data allows organizations to use the data for marketing and research while protecting individuals from data exposure. However, since true anonymization is difficult to achieve, most businesses choose to use pseudonymization techniques.

In 2014, the Article 29 Working Party (WP) issued Opinion 05/2014 on Anonymisation Techniques, in which they analyze the effectiveness of various techniques, including:

Noise Addition – adding a level of imprecision to the original data. For example, a patient’s recorded weight might be shifted by up to +/- 10 lbs., rather than stored as the precise number.
Substitution/Permutation – replacing information with other values, or shuffling values between records. For example, a patient’s height of 5’11” might be stored as “blue.”
Differential Privacy – adding carefully calibrated random noise to query results or data releases, so that the output reveals almost nothing about whether any one individual’s data was included.
Aggregation/K-Anonymity – a “hiding in the crowd” concept: if each individual is part of a larger group whose records look alike, then any record in the group could correspond to any person in it. For example, a data set might describe people in the Northwest instead of naming a specific city, like Seattle, WA.
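
To make two of these techniques concrete, here is a minimal Python sketch of noise addition and of aggregation checked against k-anonymity. It is an illustration under stated assumptions, not a method taken from the Opinion: the patient records, the `CITY_TO_REGION` lookup, the ten-year age bands, and all function names are hypothetical.

```python
import random
from collections import Counter

def add_noise(weight_lbs, spread=10.0):
    """Noise addition: shift a value by a random amount within +/- spread."""
    return weight_lbs + random.uniform(-spread, spread)

# Hypothetical lookup used to generalize a city to a broader region.
CITY_TO_REGION = {
    "Seattle": "Northwest",
    "Portland": "Northwest",
    "Miami": "Southeast",
}

def generalize(record):
    """Aggregation: replace the city with its region and the age with a 10-year band."""
    low = (record["age"] // 10) * 10
    return {
        "region": CITY_TO_REGION.get(record["city"], "Other"),
        "age_band": f"{low}-{low + 9}",
    }

def smallest_group(records, keys):
    """Size of the smallest group of records sharing the same values for `keys`.

    A data set is k-anonymous over `keys` when this number is at least k.
    """
    counts = Counter(tuple(r[k] for k in keys) for r in records)
    return min(counts.values())

# Made-up sample data for illustration only.
patients = [
    {"city": "Seattle", "age": 34},
    {"city": "Portland", "age": 37},
    {"city": "Seattle", "age": 31},
    {"city": "Miami", "age": 52},
    {"city": "Miami", "age": 58},
]

generalized = [generalize(p) for p in patients]
k_before = smallest_group(patients, ["city", "age"])
k_after = smallest_group(generalized, ["region", "age_band"])
print(f"smallest group before: {k_before}, after: {k_after}")
```

In this sample every raw record is unique, while after generalization each record shares its region and age band with at least one other record. Passing a check like this does not by itself make a data set anonymous, since other columns may still single someone out.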

When done properly, anonymization can place data outside the scope of the GDPR.

What is pseudonymization?

Pseudonymization involves replacing actual data with pseudonyms. Article 4(5) of the GDPR defines pseudonymization as:

“…the processing of personal data in such a manner that the personal data can no longer be attributed to a specific data subject without the use of additional information…”

Like anonymization, there are various techniques to pseudonymize data, including:

Scrambling – the mixing or obfuscation of letters. For example, the word “obfuscate” could become “tocbusafe”.
Encryption – the process of converting the information into a code that is unintelligible. In most cases, encrypted data can be decrypted by use of the corresponding key.
Tokenization – replacing sensitive parts of the data with non-sensitive placeholder values. For example, a credit card number “4111 1111 1111 1234” can become “4281 **** **** 2819”.
Data blurring – using an approximation of values to render the data meaningless. Think of a portrait with a blurred face: you know there is a person represented there, but you cannot identify who it is.
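
A few of these techniques can be sketched with nothing but the Python standard library. The sketch below is only an assumption about how one might implement them: `scramble`, `TokenVault`, and `blur_date_of_birth` are hypothetical names, and the card number is a placeholder rather than real data. Encryption is left out because it is best done with a vetted cryptography library rather than hand-written code.

```python
import random
import secrets

def scramble(text, seed=None):
    """Scrambling: shuffle the characters of a string."""
    chars = list(text)
    random.Random(seed).shuffle(chars)
    return "".join(chars)

class TokenVault:
    """Tokenization: swap a sensitive value for a random token.

    The mapping stays inside the vault, so only someone with access
    to the vault can turn a token back into the original value.
    """

    def __init__(self):
        self._value_to_token = {}
        self._token_to_value = {}

    def tokenize(self, value):
        # Reuse the existing token so the same value always maps to the same token.
        if value not in self._value_to_token:
            token = secrets.token_hex(8)
            self._value_to_token[value] = token
            self._token_to_value[token] = value
        return self._value_to_token[value]

    def detokenize(self, token):
        return self._token_to_value[token]

def blur_date_of_birth(iso_date):
    """Data blurring: keep only the year of a YYYY-MM-DD date."""
    return iso_date[:4]

vault = TokenVault()
card = "4111 1111 1111 1234"  # placeholder value, not a real card
token = vault.tokenize(card)
print(scramble("obfuscate"), token, blur_date_of_birth("1985-07-14"))
```

Scrambling and blurring here are one-way as written, while the token vault is deliberately reversible for whoever holds it. That reversibility is what makes tokenization a pseudonymization technique rather than an anonymization one.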

Data handlers store the pseudonymized data separately from the “additional information” needed to link it back to an individual. By utilizing pseudonymization techniques, data controllers can still benefit from the data’s utility while protecting individuals’ rights.
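
One way to picture that separation is a keyed hash: the working data set carries only pseudonyms, while the secret key and the lookup table are held elsewhere. The following Python sketch assumes HMAC-SHA256 as the pseudonym function. That choice, the sample records, and the names `make_pseudonym`, `working_set`, and `lookup_table` are all illustrative assumptions, not something the GDPR prescribes.

```python
import hashlib
import hmac
import secrets

def make_pseudonym(identifier, secret_key):
    """Derive a stable pseudonym from an identifier using HMAC-SHA256."""
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]

# The key and the lookup table are the linking information. In practice they
# would live in a separate, access-controlled store, not next to the data.
secret_key = secrets.token_bytes(32)
lookup_table = {}

# Made-up sample records for illustration only.
records = [
    {"name": "Alice Example", "diagnosis": "asthma"},
    {"name": "Bob Example", "diagnosis": "fracture"},
    {"name": "Alice Example", "diagnosis": "allergy"},
]

working_set = []
for record in records:
    pseudonym = make_pseudonym(record["name"], secret_key)
    lookup_table[pseudonym] = record["name"]
    working_set.append({"subject": pseudonym, "diagnosis": record["diagnosis"]})

for row in working_set:
    print(row)
```

Because the same name always yields the same pseudonym, analysts can still follow one subject across records without seeing who it is. Anyone holding the key or the lookup table can reverse the link, which is why the result remains personal data.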

That said, it is important to note that pseudonymized data is still personal data and remains within the scope of the GDPR. According to Article 29 Working Party Opinion 05/2014 on Anonymisation Techniques (1):

“Pseudonymised data cannot be equated to anonymised information as they continue to allow an individual data subject to be singled out and linkable across different data sets”.

In summary

The effectiveness of both anonymization and pseudonymization depends on the business case and individual circumstances. Both techniques can help organizations comply with the laws designed to protect personal data.

It is interesting to note that while citizens of the EU have this legal protection of their personal data, the US has yet to enact a comparable comprehensive federal law. Even so, personal data protection is very much on the minds of business and government leaders, as well as citizens, in the US.

Timothy J. Morris 7y

Nice article and great infographic. From a programmatic standpoint, I've always thought it was fairly straight-forward to scrub, anonymize, redact, and de-identify. The big trick is to make that data meaningful as aggregate cohorts (in health care) and as well as being useful in non-production environments, i.e. it's not helpful to change my name to "ASDF". I found it particularly challenging passing the test to avoid the likelihood of data RE-IDENTIFICATION (HIPAA rule) https://en.wikipedia.org/wiki/Data_Re-Identification
