GDPR and Metadata management.

At the time of writing (July 2017), there are still enterprises that have not embarked on their journey towards GDPR compliance. Other enterprises may have started a ‘GDPR awareness’ phase, in which they begin to realise what the impact of the regulation on day-to-day business will be. Still others know what the impact on their 3P’s (people, process, product) will be, but do not know where to start in becoming compliant. This article is written with those enterprises in mind and is meant to lend a helping hand in getting a head start on becoming GDPR compliant before 25 May 2018.

Metadata management

I strongly believe that part of becoming and staying GDPR compliant has to do with adequate Metadata management. In the latest DAMA-DMBOK2 Data Management Framework (pages 417 and 418, to be exact), DAMA International (see the DAMA website) introduces Metadata as:

 “The most common definition of Metadata, ‘data about data’ is misleadingly simple. The kind of information that can be classified as Metadata is wide-ranging. Metadata includes information about technical and business processes, data rules and constraints, and logical and physical data structures. It describes the data itself (e.g., databases, data elements, data models), the concepts the data represents (e.g., business processes, application systems, software code, technology infrastructure), and the connections (relationships) between the data and concepts. Metadata helps an organization understand its data, its systems and its workflows. It enables data quality assessment and is integral to the management of databases and other applications. It contributes to the ability to process, maintain, integrate, secure, audit and govern other data.”

… and a bit further in this introduction it says …

“Without reliable Metadata, an organization does not know what data it has, what the data represents, where it originates, how it moves through systems, who has access to it, or what it means for the data to be high quality. Without Metadata, an organization cannot manage its data as an asset. Indeed, without Metadata, an organization may not be able to manage its data at all.”

Since the GDPR is all about knowing where your privacy-sensitive data comes from, where it resides within your systems, who is using it and how it is being used, proper data management, and Metadata management in particular, is key to compliance.

GDPR activities and Metadata

Every EU member state’s data protection supervisory authority (SA) has a list of activities to execute to become GDPR compliant that is similar or equal to the following list:

Activity 1: Awareness

Make sure that the relevant people in your organization (such as policy makers) are aware of the new privacy rules. They must estimate what the impact of the GDPR is on your current processes, services and goods, and what adjustments are needed to be GDPR compliant.

From a Metadata perspective, pay special attention to what data are used, how they are used, by whom they are used and what they are used for in these processes, services and goods. 

Activity 2: Rights of data subjects

A data subject is an individual who is the subject of personal data. Under the GDPR, data subjects get more and improved privacy rights. Make sure that they are able to exercise these rights. Consider existing rights, such as the right to access and the right to correction and removal. But keep in mind new rights as well, such as the right to data portability. Under this right, you need to ensure that data subjects can easily obtain their data and then pass it on to another organisation if they want. Also, data subjects can file complaints with the SA about how you handle their data. The SA is obligated to handle these complaints.

To ensure a prompt and adequate response when a data subject exercises the right to access data, the right to correction of data, the right to removal of data, the right to data portability or the right to gain insight into how an enterprise handles data, proper Metadata management procedures need to be in place.

Activity 3: Insights in data processing

Document what personal data you process and for what purpose, where this data comes from and with whom you share it. Under the GDPR you have a documentation obligation, which means that you must be able to demonstrate that your organization is in compliance with the GDPR. You will need this documentation when individuals exercise their privacy rights. If they ask you to correct or delete their data, you must pass that request on to the organizations with which you shared their data. Also document on which legal basis you process this data. For example, do you claim a legitimate interest, or do you request permission (consent) from the data subjects for a specific use of the data?

This documentation, which gives insight into data processing, needs to be stored in a Metadata repository where it is easily accessible and immediately ready to be used when needed. It should support fast insight into which data is used in which processes, by whom and how.
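As an illustration only, the kind of record such a repository might hold can be sketched in a few lines of Python. The class name, field names and sample entries below are assumptions made for this sketch; they are not prescribed by the GDPR or by any particular Metadata tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProcessingRecord:
    """One documented processing activity (all field names are illustrative)."""
    process: str             # business process that uses the data
    data_items: List[str]    # personal data attributes involved
    purpose: str             # why the data is processed
    legal_basis: str         # e.g. "consent" or "legitimate interest"
    source: str              # where the data comes from
    shared_with: List[str] = field(default_factory=list)  # recipients of the data


def records_using(records, data_item):
    """Return every processing record that touches the given data item."""
    return [r for r in records if data_item in r.data_items]


def recipients_to_notify(records, data_item):
    """Organizations that received the data item and must be told about a correction or deletion."""
    return sorted({org for r in records_using(records, data_item) for org in r.shared_with})


# Hypothetical sample content of the repository.
repository = [
    ProcessingRecord("newsletter", ["email"], "marketing", "consent",
                     "web form", ["MailVendor"]),
    ProcessingRecord("billing", ["email", "iban"], "invoicing", "contract",
                     "order system", ["BankCo", "MailVendor"]),
]
```

With records like these, a correction request for an e-mail address can be answered by calling recipients_to_notify(repository, "email"), which lists every organization the address was shared with.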

Activity 4: Privacy impact assessment (PIA)

Under the GDPR you may be required to perform a privacy impact assessment. This is an instrument for mapping out the privacy risks of data processing and then taking measures to reduce those risks. You must perform a PIA if your intended data processing is likely to entail a high privacy risk.

Part of metadata management is identifying what data are privacy sensitive. Adequate execution of a PIA will greatly benefit from good metadata management. 

Activity 5: Privacy by design & privacy by default

Familiarize your organization with the GDPR mandatory principles of privacy by design and privacy by default, and see how to apply these principles within your organization. Privacy by design means that you are already protecting personal data while designing products and services. Privacy by default means that you must take technical and organizational measures to ensure that you, by default, process only personal information necessary for a very specific purpose in line with the overall purpose of your business.

Metadata can support enforcing technical measures to ensure processing of personal information is done for a very specific purpose in line with the overall purpose of an enterprise. 

Activity 6: Data Protection Officer

Under the GDPR, organisations may be required to appoint a Data Protection Officer (DPO). Determine if this applies to your organisation. Of course, your organisation may also voluntarily appoint a DPO. 

The DPO will be one of the key persons driving data management within an enterprise. The Metadata repository will be his/her biggest source of information when deciding on measures to be taken to protect privacy sensitive data.

Activity 7: Personal data breach notification

In the event of a personal data breach, data controllers must notify the SA which is most likely the SA of the member state where the controller has its main establishment or only establishment, although this is not entirely clear. Notice must be provided “without undue delay and, where feasible, not later than 72 hours after having become aware of it.” If notification is not made within 72 hours, the controller must provide a “reasoned justification” for the delay.

Metadata, for instance the creation date of a file or the name of the database that was hacked, helps establish if, when and how a data breach has happened.
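The 72-hour window quoted above is simple arithmetic once the moment of becoming aware of the breach has been recorded as Metadata. A minimal sketch in Python, in which the function names and the sample timestamp are assumptions made for illustration:

```python
from datetime import datetime, timedelta, timezone

# The window named in the regulation's breach notification rule.
NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(aware_at):
    """Latest moment to notify the SA, counted from becoming aware of the breach."""
    return aware_at + NOTIFICATION_WINDOW


def needs_justification(aware_at, notified_at):
    """True if the notification was made later than 72 hours after awareness."""
    return notified_at > notification_deadline(aware_at)


# Hypothetical example: the breach was noticed on 1 June 2018 at 09:00 UTC.
aware = datetime(2018, 6, 1, 9, 0, tzinfo=timezone.utc)
deadline = notification_deadline(aware)
```

Using timezone-aware timestamps avoids ambiguity when the systems involved are located in different member states.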

Activity 8: Data processing contracts

Did you outsource your data processing to a third party (referred to in the GDPR as the "processor")? Then evaluate whether the agreed measures in existing contracts with your processors are still sufficient and meet the requirements of the GDPR. If not, make timely changes.

Activity 9: Leading supervisor

Does your organisation have offices in several EU member states? Or does your data processing affect multiple member states? Then you only have to deal with one SA under the GDPR. This is called the leading supervisor (the GDPR itself speaks of the lead supervisory authority). If this applies to your organisation, determine which SA this is likely to be.

Activity 10: Consent management

Your data processing may depend on the consent of the data subjects involved. The GDPR imposes stricter requirements on giving consent. Therefore, evaluate, register and administer whether, and in what way, consent was or was not given. If necessary, change this process to comply with the GDPR. You must be able to prove that you have received valid consent from people to process their personal data. It must be as easy for data subjects to withdraw their consent as it is to give it.

In a Metadata repository, you can register and administer whether consent was or was not given to use privacy-sensitive data for specific purposes. Mind you, a Metadata repository is not meant to keep track of how and by whom this consent was given, but if you want to do that and the repository can handle it, you certainly can.
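A consent register of this kind could look roughly like the following Python sketch. The class, its method names and the identifiers used are hypothetical, and the rule that the most recently recorded event decides is a design choice of this sketch, not a requirement taken from the GDPR.

```python
from datetime import datetime, timezone


class ConsentRegister:
    """Keeps, per data subject and purpose, the history of consent given or withdrawn."""

    def __init__(self):
        # (subject_id, purpose) -> list of (timestamp, given) in recording order
        self._events = {}

    def record(self, subject_id, purpose, given, at=None):
        """Register that consent was given (True) or withdrawn/refused (False)."""
        at = at or datetime.now(timezone.utc)
        self._events.setdefault((subject_id, purpose), []).append((at, given))

    def has_consent(self, subject_id, purpose):
        """The most recently recorded event decides; no event means no consent."""
        events = self._events.get((subject_id, purpose))
        return bool(events) and events[-1][1]

    def history(self, subject_id, purpose):
        """Full trail, useful as evidence that consent was or was not given."""
        return list(self._events.get((subject_id, purpose), []))


register = ConsentRegister()
register.record("subject-1", "marketing", True)
register.record("subject-1", "marketing", False)  # withdrawal is one call, as easy as giving
```

Keeping the full history rather than a single flag is what makes it possible to show afterwards that consent was valid at the moment the data was used.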

Now where to start?

Eight of the ten GDPR activities mentioned are, to a greater or lesser degree, supported by Metadata and Metadata management. Therefore, collecting Metadata and setting up a good Metadata management process seems a good starting point on the road towards becoming GDPR compliant.

To be able to collect Metadata you need a Metadata repository. There are many ways to set up a Metadata repository, but please read the extensive article "Top 10 Mistakes to Avoid When Developing a Metadata Repository", written by David Marco in 2000. And yes, even in 2017 this article is still as valid as ever.

When developing the Metadata repository and its accompanying management processes specifically for GDPR compliance, first focus on one of your systems that deals with privacy-sensitive data. Preferably choose a system that supports secondary business processes, like a data warehouse used for business intelligence reporting, or a system that is not heavily used and can handle the additional load of harvesting Metadata without disrupting day-to-day business too much. Once a system has been chosen, pick a few attributes within that system, say five, that hold privacy-sensitive data and start registering all the Metadata for those attributes that you think is needed to become GDPR compliant (e.g. where the data comes from, where it goes to, how many times it is changed, by whom or by which programs it is changed, why it is entered, etc.).
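To make this first step concrete, the Metadata for one attribute could be captured in a small record, together with a check of what is still missing. The field names and the example attribute below are assumptions chosen for this sketch; your own list of required Metadata may well differ.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class AttributeMetadata:
    """GDPR-relevant Metadata for one privacy-sensitive attribute (illustrative fields)."""
    system: str                       # system that holds the attribute
    attribute: str                    # name of the attribute
    source: Optional[str] = None      # where the data comes from
    destinations: List[str] = field(default_factory=list)  # where the data goes to
    changed_by: List[str] = field(default_factory=list)    # people or programs that change it
    purpose: Optional[str] = None     # why the data is entered


def missing_fields(meta):
    """Names of the fields that have not been registered yet (None or empty)."""
    return [f.name for f in fields(meta) if not getattr(meta, f.name)]


# Hypothetical attribute in a data warehouse, only partly registered so far.
email = AttributeMetadata("dwh", "customer.email", source="crm.contact.email")
```

Calling missing_fields(email) then shows which Metadata still has to be collected before the attribute can be considered fully registered.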

Even for these few attributes, this is already a laborious and rather repetitive job. Of course, there is tooling available that can help you set up this Metadata repository and automate much of the work. Once these attributes have been fully registered, pick new attributes to work with. Gradually you will find yourself crawling through your system, and later through more or even all of your systems, by following the lineage and impact of data: an attribute may be composed of more than one other attribute, and those attributes could come from another system.
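Following lineage and impact amounts to walking a graph of "derived from" relations between attributes. The sketch below assumes such relations have already been registered in a plain dictionary; the attribute names are made up, and real tooling will store and traverse lineage in its own way.

```python
from collections import deque

# Hypothetical lineage: attribute -> attributes it is derived from.
DERIVED_FROM = {
    "dwh.customer.full_name": ["crm.contact.first_name", "crm.contact.last_name"],
    "crm.contact.first_name": ["webshop.signup.first_name"],
    "report.top_customers.name": ["dwh.customer.full_name"],
}


def upstream(attribute, derived_from):
    """Lineage: every attribute the given attribute is directly or indirectly derived from."""
    seen = set()
    queue = deque([attribute])
    while queue:
        current = queue.popleft()
        for parent in derived_from.get(current, []):
            if parent not in seen:  # also guards against cycles
                seen.add(parent)
                queue.append(parent)
    return seen


def downstream(attribute, derived_from):
    """Impact: every attribute that directly or indirectly depends on the given attribute."""
    feeds = {}
    for child, parents in derived_from.items():
        for parent in parents:
            feeds.setdefault(parent, []).append(child)
    # Walking the inverted relation is the same traversal in the other direction.
    return upstream(attribute, feeds)
```

Here upstream answers where a value in a report originally came from, while downstream answers which other attributes are affected when a source attribute has to be corrected or removed.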

While registering and administering all of this Metadata you will notice the need for standards, templates and principles for Metadata management. Investigate the best practices in this area. Rest assured that you will encounter every little data management problem there is, which is why none of this can be done without the guidance of data management professionals. We have a lot of good ones here at Teradata (sorry for this little commercial).

What works for you?

As I said earlier in this article, I strongly believe that part of becoming and staying GDPR compliant has to do with adequate Metadata management. But of course, there is much more to this than just Metadata management. By writing this article I wanted to hand enterprises that do not know where to start in becoming GDPR compliant a starting point. Please let me know if it worked for you or if it did not work for you. And if it did not work for you, let me know what your organisation did to start becoming GDPR compliant.

Hi Robbert, ever looked at IDaaS? iwelcome has a metadata directory in place built especially for GDPR. Customers who do not want to build the directory themselves can be enabled from the cloud.

I am still unsure whether an organisation is better off with a decoupled metadata repository, or whether systems need to be reengineered to display and manage metadata better, with some visualisation across systems that accesses the metadata in each system. Both are monumental tasks. I'm going back and forth. For me, the reason is the impact of what happens if the data and the metadata get out of sync. Metadata, or context-setting information as Barry Devlin calls it, which I like better as a descriptive term of what it is, is an intrinsic part of data in itself. Thanks for the sensible overview, most valuable.