Implementing Data Governance in an Agile environment
Data Governance and Agile principles are frequently in conflict – the former aims to protect data and its integrity, while the latter is about delivering new software quickly and frequently. As we will discuss below, and in future blogs, it is possible to find a middle ground that will make each party happy (or at least not unhappy).
It’s important to realize that the approach taken depends heavily on company size. In small companies the individual responsible for data security is most likely the Lead Developer. In mid-size companies there will typically be a DBA who controls security. In large companies there will usually be a Data Governance board.
For small companies, having the Lead Developer directly involved in both development and Data Governance, while risky, is the only practical approach.
For mid-size and large companies those two roles are distinct but overlap in terms of meeting business needs, so involving either the DBA or the Data Governance board in every planning session is critical.
Senior Management expectations
For any project (Agile, Scrum, Waterfall, etc.) to succeed, you need support from Senior Management in your company. For the most part, Senior Management’s expectations fall into three categories – is the data well understood by all its consumers, is the data consistent and of high quality, and are we complying with the regulations that govern access to the data, such as SOX and HIPAA, to name just two.
In reality these expectations span two different areas – Data Governance and Data Stewardship.
Data Governance vs. Data Stewardship
These two categories are very often confused. Data Governance is generally the control of access to data sets in compliance with legal, regulatory, or privacy requirements, while Data Stewardship is the work of the people or groups who create and maintain the data and verify that it is correct.
Data Governance has an important role in compliance, but Data Stewardship is equally important. Without correct and consistent data you cannot derive value from analytics or business intelligence activities. Garbage in, garbage out, as the saying goes.
Both parties need a seat at the table.
Why some organizations are not implementing Data Governance
Many organizations, particularly fast-growing companies that work in Agile (or Scrum) methodologies, are afraid that implementing formalized Data Governance will slow their development and time to market. While this is a valid concern, it can be mitigated by involving all the responsible parties in development planning sessions. Not doing so will result in broken sprints, caused by mismatched expectations between the development team and the Data Governance team, and in an increased risk of data breaches.
What is your risk tolerance?
It’s important to recognize that the risks associated with one type of data are not the same as those associated with another. For example, if someone is able to view a product list with product numbers, that isn’t desirable but it is not catastrophic. The same can’t be said for revealing personally identifiable customer data or ordering history.
There needs to be an assessment at the start of any project, and through all its phases, that clearly defines what your risk tolerance is. In the current wave of development (APIs, services, micro-services, cloud) this has become even more critical, as data is being shared in multiple ways.
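One way to make such an assessment concrete is to record a sensitivity tier for each type of data and compare it against the agreed tolerance during planning. The following is only a minimal sketch in Python: the tier names, the catalog entries, and the requires_governance_review helper are hypothetical and chosen for illustration, not taken from any particular tool or framework.

```python
from enum import IntEnum


class RiskTier(IntEnum):
    """Hypothetical sensitivity tiers; a higher value means more damage if exposed."""
    LOW = 1      # e.g. a product list with product numbers
    MEDIUM = 2   # e.g. internal data that is not personally identifiable
    HIGH = 3     # e.g. personally identifiable customer data


# Illustrative catalog mapping data sets to tiers (all names are assumptions).
DATA_CATALOG = {
    "product_list": RiskTier.LOW,
    "customer_pii": RiskTier.HIGH,
    "order_history": RiskTier.HIGH,
}


def requires_governance_review(datasets, tolerance=RiskTier.LOW):
    """Return the data sets whose tier exceeds the agreed risk tolerance.

    Data sets missing from the catalog are treated as HIGH so that
    unclassified data is never silently waved through.
    """
    return sorted(
        name for name in datasets
        if DATA_CATALOG.get(name, RiskTier.HIGH) > tolerance
    )


# Data sets a planned piece of work would touch.
flagged = requires_governance_review(["product_list", "customer_pii"])
```

Here flagged contains only "customer_pii", so that is the item to raise with the DBA or Data Governance board before the work is estimated. Treating unclassified data as high risk is a deliberate, conservative choice in this sketch.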
Mapping capabilities (people and infrastructure) to a strategic roadmap
Another important point is to recognize what resources you have available. It serves no long-term strategy to create a plan that you don’t have the capabilities to deliver on. Pyramid can of course help in that regard. However, many companies prefer to keep the implementation of Data Governance in-house while allowing outside consultants to audit and validate the process.
Mapping strategic roadmap to operational roadmap
A strategic roadmap is critical, but it also has to be mapped to an operational roadmap to avoid the “Ivory Tower” effect. If you work in an Agile environment, the development teams will typically be frustrated with a data governance model that does not reflect the reality of delivering workable code on a regular basis. That frustration can only be avoided if the teams with competing interests work together.
This operational roadmap should assign accountability and responsibility to the members of each team – be it Data Governance, Stewards, Compliance, or developers. In larger organizations this roadmap will need to be approved by the compliance team to ensure regulatory, security, and privacy concerns are being met, so if you have a formal compliance team or legal department, they should be involved in these planning sessions.
How do we bridge the gap?
The most important idea is to have active user participation – that means everyone who has a stake in the Data Governance, Data Stewardship or development process. Everyone must share the concerns, roles, and responsibilities associated with each phase (or Sprint) of a project at each step of the way.
With an Agile delivery model, the development team needs to understand the established rules for data access, confidentiality, and privacy so they do not make inaccurate estimates. In turn, the Data Governance team must work with the development team to understand what is intended to be delivered and give the development team the information needed to deliver on these estimates.
Who is accountable?
It frequently happens, particularly in larger companies, that Data Governance, Stewards, and Development teams are run by committee. This is the antithesis of an Agile process. For each area one person should take ownership of the project goals – basically a single throat to choke. Failing to assign this person up front leads to a lack of leadership and to finger pointing.
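The single-owner rule is simple enough to check mechanically before a project starts. The sketch below is only an illustration in Python; the find_ownership_gaps helper, the area names, and the people listed are all made up for the example.

```python
def find_ownership_gaps(owners_by_area):
    """Return the areas that do not have exactly one accountable owner.

    `owners_by_area` maps an area name to the list of people named as
    accountable for it. The result maps each offending area to the number
    of owners found (0 means nobody, 2 or more means a committee).
    """
    return {
        area: len(owners)
        for area, owners in owners_by_area.items()
        if len(owners) != 1
    }


# Illustrative assignments; every name here is hypothetical.
assignments = {
    "Data Governance": ["dba_lead"],
    "Data Stewardship": ["steward_a", "steward_b"],  # run by committee
    "Development": [],                               # nobody assigned
}

gaps = find_ownership_gaps(assignments)
```

In this example gaps reports Data Stewardship (two owners) and Development (none), which are exactly the situations that lead to finger pointing later on.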
Summary
Involving the Data Governance team lead (whether the DBA or a single member of the control board) as well as a single Data Steward in each Sprint planning session, along with the development team, can mitigate some of the issues that arise from the conflicting interests of preserving data integrity and allowing for rapid development and delivery.
In future blogs we’ll discuss in more detail how to structure the teams, what roles and responsibilities make the most sense, and when and where senior management should get involved.