The paradox of data

Anushree N

Published Apr 11, 2023

Most organizations today understand that data can add value to every facet of the business. It is a resource that benefits the process that produces it as well as other related areas.

Data, unlike most resources, does not diminish in value with repeated use; the same data can serve many purposes at once. The better the data an enterprise creates, the more value it can extract through visibility, real-time analytics, and more accurate training data for AI and ML. However, the value that can be extracted depends on how well the data is shared internally while complying with applicable laws and regulations.

But like any resource, data risks must be managed, particularly in regulated industries. Controls help to mitigate such risks, so organizations that have strong controls around their data are exposed to less risk than those that don’t.

This presents a paradox: data that is permitted to be freely shareable across the enterprise has the potential to add tremendous value for stakeholders, but the more freely shareable the data is, the greater the possible risk to the organization. To unlock the value of data, we must solve this paradox. We must make data easy to share across the organization while maintaining appropriate control over it.

One way to address this paradox is a two-pronged approach. First, define ‘Data Products’, which are designed by people who understand the data, its permissible uses, and its limitations. Second, implement a ‘Data Mesh’ architecture, which aligns our data technology to those data products.

This combined approach:

Empowers data product owners to make management and use decisions for their data
Enforces those decisions by sharing data, rather than copying it
Provides clear visibility of where data is being shared across the enterprise

Aligning our Data Architecture to our Data Product Strategy

Data products are broad but cohesive groups of related data from the systems that support business operations. We store the data for each data product in its own product-specific data lake. Ideally, each lake has its own cloud-based storage layer.

For example, sales and marketing data are closely related, so together they can form the basis of one data product lake.

The services that consume data are hosted in consumer application domains. These consumer applications are physically separated both from each other and from the data lakes. When a data consumer needs data from one or more of the data lakes, we use cloud services to make the lake data visible to the data consumers, and provide other cloud services to query the data directly from the lakes. The data product-specific lakes that hold data, and the application domains that consume lake data, are interconnected to form the data mesh.

Empower the right people to make control decisions

A data mesh architecture allows each data product lake to be managed by a team of data product owners who understand the data in their domain, and who can make risk-based decisions regarding the management of their data.
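As a rough illustration of owner-driven control, the sketch below models a data product whose access grants can only be made by its owners. The class, field, and consumer names (`DataProduct`, `sales_owner`, `reporting_app`) are hypothetical and chosen only for this example; a real mesh would delegate this to the cloud platform's access-management services.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataProduct:
    """A data product lake plus the people allowed to make decisions about it."""

    name: str
    owners: set[str]
    approved_consumers: set[str] = field(default_factory=set)

    def approve(self, approver: str, consumer: str) -> None:
        # Only an owner of this product may grant access to it.
        if approver not in self.owners:
            raise PermissionError(f"{approver} does not own {self.name}")
        self.approved_consumers.add(consumer)

    def can_read(self, consumer: str) -> bool:
        # A consumer may read only after an owner has approved it.
        return consumer in self.approved_consumers


# Hypothetical product and owner names, for illustration only.
sales = DataProduct(name="sales_and_marketing", owners={"sales_owner"})
sales.approve(approver="sales_owner", consumer="reporting_app")
```

The point of the design is that the decision sits with the people closest to the data: a request approved by anyone other than an owner is rejected rather than silently accepted.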

Enforce control decisions through in-place consumption

The data mesh allows us to share data from the product lakes, rather than copying it to the consumer applications that will use it. In addition to keeping the storage bill down, sharing minimizes discrepancies in the data between the system that produced the data and the system that consumes it. That helps to ensure that the data being consumed for analytics, AI/ML, and reporting is up-to-date and accurate.
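The difference between copying and sharing can be shown with a small in-memory sketch. Here a plain dictionary stands in for a product lake, and `SharedView` is a hypothetical read-only handle that queries the lake at read time; neither corresponds to a specific cloud service.

```python
import copy

# Hypothetical product lake: table name -> rows.
lake = {"orders": [{"id": 1, "amount": 100}]}


class SharedView:
    """Read-only handle that queries the lake in place at read time."""

    def __init__(self, lake: dict, table: str) -> None:
        self._lake = lake
        self._table = table

    def rows(self) -> tuple:
        # Reads the current lake contents on every call; nothing is stored here.
        # Rows are returned as fresh dicts so the consumer cannot alter the lake.
        return tuple(dict(row) for row in self._lake[self._table])


copied = copy.deepcopy(lake["orders"])  # traditional approach: a private copy
shared = SharedView(lake, "orders")     # mesh approach: a reference to the lake

# The producing system records a new order after the consumer was set up.
lake["orders"].append({"id": 2, "amount": 250})

print(len(copied))         # 1: the copy is already out of date
print(len(shared.rows()))  # 2: the shared view sees the new order
```

The copied data drifts away from the source as soon as the source changes, while the shared view always reflects what the lake currently holds.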

Provide cross-enterprise visibility of data consumption

Historically, data exchanges between systems were either system-to-system or via message queues. Since there was no central, automated repository of all data flows, data product owners couldn’t easily see when their data was flowing between systems. A good data mesh architecture addresses the visibility challenge by using a cloud-based Mesh Catalog to facilitate data visibility between the lakes and the data consumers. One could use the AWS Glue Data Catalog or a similar cloud-based data cataloging service to enable this.

This catalog does not hold any data, but it does have visibility of what lakes are sharing data with which data consumers. This offers a single point of visibility into the data flows across the enterprise, and gives the data product owners confidence that they know where their data is being used.
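A minimal sketch of the idea, assuming an in-memory stand-in rather than any real cataloging service: the catalog stores only which lake is shared with which consumer, never the data itself. `MeshCatalog` and its method names are hypothetical and are not the AWS Glue API.

```python
from collections import defaultdict


class MeshCatalog:
    """Records which lakes share with which consumers. Holds no data rows."""

    def __init__(self) -> None:
        # lake name -> set of consumer names that lake is shared with
        self._shares = defaultdict(set)

    def register_share(self, lake: str, consumer: str) -> None:
        self._shares[lake].add(consumer)

    def consumers_of(self, lake: str) -> list:
        # Answers the product owner's question: where is my data being used?
        return sorted(self._shares.get(lake, set()))

    def lakes_used_by(self, consumer: str) -> list:
        # Answers the auditor's question: where does this consumer's data come from?
        return sorted(
            lake for lake, consumers in self._shares.items() if consumer in consumers
        )


# Hypothetical lake and consumer names, for illustration only.
catalog = MeshCatalog()
catalog.register_share("sales_and_marketing", "reporting_app")
catalog.register_share("finance", "reporting_app")
catalog.register_share("sales_and_marketing", "ml_training")

print(catalog.consumers_of("sales_and_marketing"))  # ['ml_training', 'reporting_app']
print(catalog.lakes_used_by("reporting_app"))       # ['finance', 'sales_and_marketing']
```

Because every share is registered in one place, both questions can be answered from the catalog alone, without inspecting the lakes or the consumer applications.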

To Conclude, Data Mesh in Action

Here’s an example to illustrate how the Data Mesh architecture will enable our business.

Consider a team that produces firmwide reports. Without a data mesh, it would have to extract and join data from multiple systems across multiple data domains to produce those reports.

Through the Data Mesh architecture, the data product owners for those data domains will make their data available in lakes. The enterprise data catalog will allow reporting teams to find and request the lake-based data to be made available in their reporting application. The mesh catalog will allow auditing the data flows from the lakes to the reporting application, so it’s clear where the data in the reports originates.

One development that could significantly boost this space is blockchain technology. It may make storing data easier and safer, allowing data, “the new oil”, to really start firing up the engines of progress.
