Data-Centered Architecture - First Thoughts
The State Library of Victoria, Melbourne, Australia by andrew-t8 on Pixabay

The recently founded Data-Centered-Architecture LinkedIn Group has been thinking about what #datacenteredarchitecture is, and why it is important. This article, which does not necessarily represent the views of all group members, shares some of the ideas. If you agree or disagree, please comment! If you want to pursue the topic further, join the group!

What Is Data-Centered Architecture?

Data-centered architecture is an architecture style in which the data is designed first and applications are then designed to create and use it. The architecture focuses on the flow of information through the organization and then adjusts the processes to streamline the flow.

The approach requires a thorough understanding of the data: where it comes from, who owns it, what is the master and what is a copy, who uses it and how, how long it has to be retained, when it has to be archived, how confidential it is, and so on.
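As a rough illustration, the questions above can be recorded as one catalogue entry per data asset. The following is a minimal sketch, not a prescribed model: the class name `DataAsset`, its field names, the confidentiality levels, and the sample values are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum


class Confidentiality(Enum):
    # Hypothetical classification levels; real schemes vary by organization.
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3


@dataclass(frozen=True)
class DataAsset:
    """One catalogue entry answering the basic questions about a data asset."""
    name: str
    source: str                      # where it comes from
    owner: str                       # who owns it
    is_master: bool                  # True for the master, False for a copy
    consumers: tuple[str, ...]       # who uses it
    retention_days: int              # how long it has to be retained
    archive_after_days: int          # when it has to be archived
    confidentiality: Confidentiality

    def must_archive(self, created: date, today: date) -> bool:
        """True once the asset has reached its archiving age."""
        return today - created >= timedelta(days=self.archive_after_days)

    def may_delete(self, created: date, today: date) -> bool:
        """True once the retention period has fully elapsed."""
        return today - created > timedelta(days=self.retention_days)


# Illustrative values only.
customer_master = DataAsset(
    name="customer",
    source="CRM system",
    owner="Sales Operations",
    is_master=True,
    consumers=("billing", "marketing"),
    retention_days=3650,
    archive_after_days=365,
    confidentiality=Confidentiality.CONFIDENTIAL,
)
```

Even a record this small makes the gaps visible: if nobody can fill in the owner or the retention period, that is itself a finding for the architect.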

There are other aspects that might be regarded as central. User behavior is certainly one. Process-centered would be the traditional opposite to data-centered. Service-oriented puts services at the center. Particular technologies such as blockchain could claim a central role. Putting data at the center is a choice that the architect makes.

There are very good reasons for putting data in the center, rather than user behavior, services, processes, blockchain or whatever. Data represents information, which is the lifeblood of any organization. Think about it. What do we back up to ensure business continuity? What do we encrypt to prevent unauthorized usage? Whether you like it or not, it is data that holds significant value and needs to be protected. If it is that valuable, we had better give it a central role.

Enterprise Data

Data is valuable to an enterprise because it represents information about the enterprise’s assets, inventions, financial position, and so on. “Data” does not just mean fields in a table. It can take many forms, such as documents, images, emails, and chats. It is captured, curated, and stored in structured stores such as relational databases and data vaults and in unstructured stores such as flat files and data lakes. Services disclose data, processes process it, and users respond to it.

In most enterprises today there is a “brownfield” situation, where many applications and the accompanying data already exist and there is little freedom to design the data. The architect simply has to accept what is there, analyze what it is and what its characteristics are, and treat it accordingly. In an organization of even average complexity, this is a very difficult job, with many challenges: existing processes are not documented, old projects had waivers and did not comply with current security standards, and so on.

Also, now that companies look outside themselves to find data that concerns them, the task of understanding that data and how to use it has become even harder. All sorts of unstructured data may be found in social media; whose data is it, and does it belong in the enterprise architecture?

Digital Transformation

Digital transformation is the key trend in enterprises today. It has a number of business and technology drivers. “Data centricity” is central to all of them, but the modern concept of data centricity is different from the concept of a decade ago. The idea of “golden sources” and “one source of truth” is fading into the oblivion of idealism. It had relevance in the context of the relational database. In a world with varying data velocity, sources and variety, and with distributed stores, the concept of data-centered architecture has changed. The implications are immense: in how we architect, what we consider to be a source of truth, and how data is used and shared in predictive analytics and for operational and On-Line Transaction Processing (OLTP) purposes.

Understanding the Data

Enterprise data must be understood as information by all the users that need to have access for creating, reading, updating, or deleting it.

Data analytics and “big data” have been hot topics in recent years. They promise the ability to generate insights that give a deeper understanding of the data. These can be important aspects but should not be the main drivers of a data-centered architecture.

The effectiveness of big data analysis has been questioned: see for example Alex Woodie's Datanami article. It is a powerful technique but must be used wisely if it is to deliver benefits. It has often failed by trying to fit data into preconceived notions, or when there is an unexpected new development. For example, a person who went to a shop to buy groceries might see or hear something and suddenly drop everything and go to a dealer a few miles away to buy a new car. The grocery store’s analytics would almost certainly have failed to predict this.

In many cases, the primary need is for a basic understanding of data that is described incompletely or inconsistently, rather than for in-depth analysis.
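A basic check of this kind needs very little machinery. The sketch below assumes that each data set is described by a simple dictionary; the required field names and the two helper functions are hypothetical and chosen only for illustration.

```python
# Hypothetical list of fields every data description should carry.
REQUIRED_FIELDS = ("source", "owner", "retention", "confidentiality")


def missing_fields(description: dict) -> list[str]:
    """Return the required fields that are absent or empty (incomplete description)."""
    return [field for field in REQUIRED_FIELDS if not description.get(field)]


def conflicting_fields(first: dict, second: dict) -> list[str]:
    """Return fields on which two descriptions of the same data disagree (inconsistent description)."""
    shared = first.keys() & second.keys()
    return sorted(field for field in shared if first[field] != second[field])


# Two teams describing the same customer data, with illustrative values.
sales_view = {"source": "CRM", "owner": "Sales"}
finance_view = {"source": "CRM", "owner": "Finance", "retention": "10 years"}

print(missing_fields(sales_view))                    # fields Sales never recorded
print(conflicting_fields(sales_view, finance_view))  # fields the teams disagree on
```

Findings like these point to conversations the architect needs to have with the data's owners before any deeper analysis is worth attempting.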

Enterprise Architecture

Enterprises develop and describe their architectures so that people can understand how they work. If you are going to repair a clock, it helps to understand whether it is powered by weights, a spring, or electricity, whether time intervals are measured by a pendulum, a balance wheel, or a crystal oscillator, and what the functions of its cogwheels, integrated circuits, or other components are. Enterprises are even more varied in their construction than clocks. If you are going to change an enterprise, by introducing new business processes or IT systems, or in other ways, it helps to have a description of its architecture. Having an architecture description also helps if you want to develop a relationship with an enterprise, for example as a business partner.

Architecture, as defined in ISO/IEC/IEEE 42010, is “Fundamental concepts or properties of a system in its environment embodied in its elements, relationships, and in the principles of its design and evolution.” Systems, including enterprises, have architectures. An architecture description is used to express an architecture of a system.

Enterprise architects use architecture frameworks to guide them and help them collaborate. There are many of these. TOGAF®, a popular open architecture framework, is the prime example considered here.

The value of architecture for buildings has long been recognized. The Roman authority Vitruvius wrote over 2000 years ago that a structure must exhibit the three qualities of firmitas, utilitas, venustas: strength, utility, and beauty. This remains true of good enterprise and IT systems today.

Architecture Development

In TOGAF, and many other architecture frameworks, enterprise architecture development starts with the business architecture. Any enterprise processes data for its particular business purposes. With that as the foundation and of prime importance, other architectural aspects are developed. Typically, these include data architecture, applications architecture, and technology architecture.

Data architecture is not the same as data-centered architecture. It is defined in TOGAF (version 9.1) as, “A description of the structure and interaction of the enterprise's major types and sources of data, logical data assets, physical data assets, and data management resources.” Data architecture is needed whether or not the overall architecture is data-centered.

In a data-centered architecture development, the data architecture is considered before the application and technology architectures, and determines or at least strongly influences them. The architect should first understand the business data. Applications and services can then be defined to meet business goals, and technology can be selected to support services, applications and data while meeting requirements for qualities such as efficiency, robustness, and security.

In brownfield situations you have to start with what is there. One approach is to take the existing data as a starting point, evolve business processes to meet new needs using the existing data where appropriate, and build new programs to support them.

Architects often need to look beyond a single business organization, at an ecosystem of different enterprises, each with different business processes and goals. Most enterprises today belong to one or more such ecosystems. Each participating enterprise develops its architecture in the light of the existing ecosystem, and attempts to change the ecosystem to further its own business ends. The ecosystem and its constituent enterprises evolve in symbiosis.

The Data-Centered Approach

Data-centricity is an approach that architects and architecture teams can follow in developing enterprise and IT architectures. It is taking new forms to fit with the modern trends of digital transformation. It results in systems that are simple and robust. It can be applied in situations where the definition of the data is outside the architects’ control, such as “brownfield” developments and business ecosystems where an enterprise is dependent on external data.

Data-centered architecture may not be appropriate in every case, but should often be the approach of choice for today’s architecture developments.

 

What is Old is New Again. Information engineering has been around for more than 20 years and has been used in EA as an alternative to the business function / process-centric approach. Information Engineering is a mature approach pioneered by Clive Finkelstein (former IBM colleague of John Zachman) and James Martin. It focuses on the information holdings (including data and knowledge) because they are more stable and enable real business transformation, not just process tweaking. It is no accident that the data column is the first in the Zachman Framework; it is the fundamental asset for a knowledge-based company. I have employed an information engineering approach, once mentored by Clive, to transform federal government service delivery from program-centric to life event-centric, in the context of a TOGAF-based EA approach. That work was too much for the existing CXO cadre to absorb at a time when their mindset was reliability-based and hyper-risk-averse, and education was lacking. Should dust it off and try again now. Fundamentally it works, and before starting down a new road with data-centric architecture, please re-use the Information Engineering approach.

Thanks Chris for starting this group. The key objective is to allow information to be created quickly, in the way it is likely to be consumed. Unification of data needs to be a focus, esp. in brownfield cases. Moreover, IMHO, when dealing with enterprise data architecture it helps to move away from ER (entity-relationship) based ways of modelling.
