Data, Models, Architecture
Photo taken from https://www.pca-stream.com/en/articles/the-creation-of-value-through-architecture-71

1. Introduction

There appears to be considerable terminological confusion in the field around concepts such as data, models, architecture, data models, data architecture, and BI architecture. What kind of "beasts" are they? What do they look like? Yesterday (08-April-2021), this discussion popped up again in a WhatsApp conversation. In this note, I'm sharing my current thoughts on the matter.

Please note that I am in no way claiming that this is "the ultimate set of definitions". Through my studies of systems theory, complexity theory, the Cynefin model, and the work of several scholars (e.g. Bateson, Habermas, Hoppenbrouwers), I'm increasingly aware of the personal nature of language and the importance of languaging (as a verb, stressing that we should pay attention to how we and others use language). I'm offering my views in the hope that they provide a useful perspective.

2. Models

There are many definitions of what a 'model' is. Most boil down to "an abstraction of reality" or some variation on that theme. Recently I read a different approach in the work of Proper and Guizzardi, building on Ogden's semiotic triangle. Their definition of a model is: "A model is an artifact that is acknowledged by an observer to represent an abstraction of some domain for a particular purpose", illustrated by the following figure:

[Figure: illustration of the Proper and Guizzardi definition of a model]

The idea is that a modeller observes some domain/scope (I will use UoD to denote the Universe of Discourse). Through mental processes, the modeller builds up an understanding of this UoD, which can then be expressed in an artefact. If an observer acknowledges that this artefact can stand as a model for the UoD, then this artefact is said to be a model for that UoD. Points arising:

  • In some approaches (e.g. the FRISCO report) a distinction is made between models (a mental construct) and model representations (an artefact). I've used this distinction for years as well. These days, I am more inclined to follow the approach proposed above.
  • Models need not be in boxes and arrows. I could also use Lego bricks and a stop-motion movie to create an artefact that can stand as a model for some UoD.
  • Whether something is (better: can stand as) a model for a UoD is subjective: different modelers/observers may disagree. This stresses the need for constructive communication. Modeling is a "team sport", in my view.
  • In their research, Proper and Guizzardi derive a taxonomy for different types of modeling goals, most notably a) models for understanding the UoD, and b) models intended to bring about a change in the world.

3. Architecture

The term 'architecture' is notoriously difficult to define. A dear colleague (Hans Bot) wrote a comment years ago that we should stop talking about architecture, and start doing architecture instead. I couldn't agree more. Still, I find that it sometimes helps to sit down and capture my current thinking.

Following the excellent book by Greefhorst & Proper, my definition of architecture is "those properties of an artefact that are necessary and sufficient to meet its essential requirements". The adjective "essential" refers to the question "what does/should keep stakeholders awake at night?". In other words, architecture is about designing artefacts that help stakeholders to get some sleep.

In our world, we tend to make architectures of systems (software systems, organizations seen as systems, etc.). Since systems typically involve components, the Greefhorst & Proper definition can be specialized to the ISO/IEC/IEEE definition: "the fundamental properties of a system, embodied in its components, their relationship to each other and to the environment, and the principles guiding its design and evolution". Points arising:

  • We make architectures of some UoD, capturing only the essential/fundamental properties. As such, architectures can stand for the UoD and are models. The inverse is not true (i.e. not all models are architectures).
  • The adjectives "essential" and "fundamental" allude to the fact that we worry only about the big picture and the most important characteristics of the domain. Again, this is a subjective assessment that may depend on the concerns of the stakeholders/modelers/architects involved.

4. Data

Defining 'data' is not as easy as I'd like. I could attempt it by referring to the semiotic ladder (which I hope to write on more soon), or Shannon's information theory. I won't go there for the time being. For now, I'll stick to an observation. It seems a safe statement that data is about "something". Take, for example:

[Figure: example table of data about people and their city of residence]

I am inclined to believe that this data can stand for reality and therefore that it is already a model of reality. Perhaps not a common view, but I do think it is a useful one, as it stresses the need for checking whether the data in our databases corresponds to the real world.

5. Data models

The simple definition of a data model is: a model of the data. If we apply the idea of a model as proposed by Proper and Guizzardi, that would mean that we study data (the UoD) and create a model (the artefact) that can stand for the data that we have studied. Again, the key is that the modeler/stakeholder acknowledges that the artefact that we produce (typically: boxes and arrows) can stand as a model for the data. Points arising:

  • The term 'data' is typically used in a relational sense (i.e. relational theory as invented by E. F. Codd and described by, among others, C. J. Date).
  • Side note: I am increasingly inclined to follow Date's argument that Tables and Relations are two different things, and as a corollary Tabular Databases might be a better name than Relational Databases.
  • The process of database normalization (UNF - 1NF - 2NF - 3NF - BCNF) helps to ensure that the artefact/model that we develop (table structures and constraints, potentially visualized as an Entity Relationship Diagram) is a good model of the data. Continuing the previous example, we could propose "People have residence in a City" as a model. This would raise the question: in one city? Or in at least one city? If we look at the data (i.e. the table presented above), it seems that this is not the case. If we apply further domain knowledge, we could argue that people may have more than one residence, depending on the time of year. Basically, this would mean that we revoke our earlier statement that the presented table can stand as a model for what is going on in the UoD. We would have to collect more data and start the whole process over again.
  • Not all structured data is relational (e.g. hierarchical data may be modelled in an XML Schema Definition (XSD)), and not all data is structured. The same line of thinking applies, however.
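
The cardinality question raised in the normalization bullet ("in one city? or at least one city?") can be put to the data itself. What follows is a minimal Python sketch, not a definitive method: the rows in RESIDENCES are hypothetical sample data (the table from this article is not reproduced here), and the function names are illustrative assumptions.

```python
# Hypothetical sample rows (person, city). These are NOT the rows from the
# article's table; they only illustrate the kind of check discussed above.
RESIDENCES = [
    ("Alice", "Utrecht"),
    ("Bob", "Amsterdam"),
    ("Bob", "Rotterdam"),  # Bob is recorded with two cities
    ("Carol", None),       # no city is recorded for Carol
]


def cities_per_person(rows):
    """Group the recorded cities by person, ignoring missing values."""
    grouped = {}
    for person, city in rows:
        # Register the person even when no city is recorded.
        cities = grouped.setdefault(person, set())
        if city is not None:
            cities.add(city)
    return grouped


def cardinality_violations(rows):
    """List the people for whom 'exactly one city' does not hold."""
    grouped = cities_per_person(rows)
    return {
        "no_city": sorted(p for p, c in grouped.items() if len(c) == 0),
        "several_cities": sorted(p for p, c in grouped.items() if len(c) > 1),
    }


if __name__ == "__main__":
    print(cardinality_violations(RESIDENCES))
```

An empty result for both lists would only show that the data at hand does not contradict the statement. Whether the statement is a good model of the UoD still has to be acknowledged by the modelers and stakeholders involved.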

6. Data architecture

Following the same approach as with data models, data architecture can be defined as the architecture of data. Or, perhaps more precisely, the architecture of the data landscape. This would mean that we try to understand and document (a) what the fundamental properties of the data landscape are, and (b) what the underlying principles are. In other words, a data architecture addresses the key concerns that stakeholders have with respect to data.

Data is, in and of itself, not very useful nor very interesting. In a sense, it provides the "glue" between business (processes) and IT (systems). This broad view is reflected by the concerns that a data architecture should address:

  • How is the data organized in subject areas / data clusters and what is the underlying logic behind this clustering?
  • Following the work of Jeanne Ross: what are the requirements around standardization and integration of data across the enterprise?
  • Following the work of Damhof: how do we deal with integrating data use and creation in structured/opportunistic settings? (The "Damhof model" is highly recommended.)
  • How do we deal with data integration challenges across the enterprise? And what is the interplay between "on premise" and "the cloud" in this respect? What are the key patterns that we want to use? (The book by Piethein Strengholt is highly recommended.)
  • How do we enable value creation with data while ensuring that we have sufficient grip on our data? (See the article by DalleMule and Davenport.)

7. Business Intelligence architecture

Following the same approach as with data models and data architecture, business intelligence architecture (BI Architecture for short) can be defined as the architecture of the business intelligence capability of the organization. This would mean that we try to understand and document (a) what the essential characteristics of the organization of the BI capability are, and (b) what the underlying principles are. Again, this can only be done by taking the key concerns of stakeholders into account.

This, again, requires a broad perspective. Typical aspects that are taken into account when considering a capability are people (what knowledge, skills, and mindset do people need), organization (what are formal roles and responsibilities), process (is there an ordering of activities, and if so what is the organizing logic behind it), information (what data/information do we need to implement the capability), and technology (what types of systems do we need to implement the capability). All of this, of course, at a high level of abstraction. Typical concerns are:

  • What type of insights do we expect to gain through the implementation of our BI architecture? E.g. operational reports per application or across applications, exploratory analyses, predictive analyses, etcetera.
  • Given the overall architecture of the enterprise, what is the organizing logic behind the capability? Is our vision to realize one (central) capability with one central data warehouse? Or do we allow a more diverse approach with e.g. data lakes, data virtualization and other platforms? Will all software components be realized on premise, in the cloud, or in a hybrid setup? Do we allow different business units to make their own choices, or is our vision to keep everything under central control?
  • What is the link with other data management capabilities, more involved with ensuring data is available at the right time with the right quality for the right stakeholders?

8. Reflection

In this short article, I have presented my current understanding of several concepts. Looking back at my research notes of the last few years, I see that my views have evolved. Slowly, but still. It is interesting to capture my insights from time to time, and to link them back to experiences in the field, to reference works (TOGAF, DMBOK, etc.), and to scientific publications. Thanks for all the good conversations and as always, if you have something interesting to read: do send some references my way!


Literature

L. DalleMule & T. H. Davenport (2017) What’s your data strategy? Harvard Business Review, 95(3), 112-121.

C. J. Date (2004) An introduction to Database Systems. 8th edition. Addison Wesley

C. J. Date (201) E. F. Codd and Relational Theory: a Detailed Analyses of Codd's Major Database Writings. Lulu Publishing Services

B. van Gils (2020) Data management: a gentle introduction - balancing theory and practice. Van Haren Publishing.

D. Greefhorst & H.A. Proper (2011) Architecture principles: the cornerstones of enterprise architecture. Springer Science & Business Media.

G. Guizzardi & H. A. Proper (2021) On understanding the value of domain modelling (forthcoming)

E. Falkenberg, W. Hesse & P. Lindgren (1998) FRISCO–A Framework of Information System Concepts–The FRISCO Report. IFIP WG 8.1 Task Group FRISCO.

ISO/IEC/IEEE 42010:2011 - Systems and software engineering - Architecture description. International Organization for Standardization. 24-nov-2011

C. K. Ogden & I. A. Richards (1923) The meaning of meaning – a study of the influence of language upon thought and of the science of symbolism. Magdalene College, University of Cambridge

H. A. Proper & G. Guizzardi (2021) On domain conceptualization (accepted for publication). Enterprise Working Conference

P. Strengholt (2020) Data Management At Scale. Best practices for enterprise architecture. O'Reilly

J. W. Ross, P. Weill & D. Robertson (2006) Enterprise architecture as strategy: Creating a foundation for business execution. Harvard Business Press.



Data, the subject with two faces. On the one hand, you only want/need the data required to perform the responsibility delegated to you in the context of an agreement. On the other hand, you cannot have enough data to analyze for new commercial opportunities or possible threats.

Minimum of data: for processing data in the context of fulfilling the conditions of an agreement, you only need the data that is related to the responsibility, no more and no less. When you have less data, you cannot perform the delegated responsibility properly. When you have too much data, there is a risk that you perform responsibilities that are out of scope from an architectural demarcation of responsibilities. This last risk is serious and present: it causes many incidents in production because a change didn't take the unknown functionality into scope, and/or you have a massive impact analysis and governance to do for each change. The impact on the organization is that you lose a great deal of the agility you need to anticipate new commercial initiatives and/or new or changing legal obligations you have to comply with.

Maximum of data: for analyzing your customer base for new commercial opportunities, or even gaining new customers, you can never have enough information to query. The same goes for risk analysis, where even biometric data could be of great help to lower, or better, exclude the risk of making new deals or processing erroneous transactions. So I fully agree with the above statement that you should model for a UoD: there is not just one data model, and the perspective from which you look at your universe matters. Furthermore, data models are artifacts; the question is whether they need to be maintained, or developed anew every time you need a data model. The data model serves a purpose, i.e. to align the understanding of the Universe of Discourse in a community. It is not only the artifact itself but also the process of creating it that establishes a common ground to cooperate. Only when you experience something does it come alive!

So much philosophy, so little knowledge.

Really? I realized that 5 decades ago as soon as I got into the industry and the confusion has become constantly worse -- because most of it is marketing drivel. Which is then taken seriously and "analyzed" and "explained": books, articles, seminars, consulting. A very profitable enormous waste of time and resources.

Bas - I found this comment - and many others in your article - an interesting and increasingly pervasive sentiment in data management circles: "Data is, in and of itself, not very useful nor very interesting." Given the amount of resources and attention bestowed upon data, this seems unnecessarily pejorative. If someone (or some 'thing') took the time to create data of interest to a business, and if we assume that finite resources were committed to making sure the data is clean, relevant, timely, and reusable, how can it be said to have little or no value? I suppose you could argue that you didn't say that data has no value, but if that was the case, why was it recorded and stored in the first place? If something is not useful or of interest, why bother with it?
