Data Vault - The basics
This article continues an earlier one, so please read that first: https://www.garudax.id/pulse/data-vault-underlying-principle-madani-basha/
The earlier article spelt out that the data vault design is underpinned by a fundamental principle, viz., "Separate out stable things from things that are less stable".
We will now see how the trio of Hub-Link-Satellite constructs is in accord with that principle, using the following example. The diagram below represents a design following the "Normalised Design Technique". Let us assume that there is just one source system. We will address the more realistic scenario of multiple source systems in the follow-on articles. For now, let us keep matters simple.
The model shows that there are CUSTOMERs and that each CUSTOMER must belong to a CUSTOMER_GROUP.
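For readers who prefer code to diagrams, the normalised model can be sketched roughly as below, using Python's built-in sqlite3 module. Only the two entities and the mandatory CUSTOMER to CUSTOMER_GROUP relationship come from the article; the column names (customer_code, group_name and so on) are assumptions made purely for illustration.

```python
import sqlite3

# Hypothetical column names; only the two entities and the mandatory
# relationship between them are taken from the article.
NORMALISED_DDL = """
CREATE TABLE customer_group (
    group_code TEXT PRIMARY KEY,
    group_name TEXT NOT NULL
);
CREATE TABLE customer (
    customer_code TEXT PRIMARY KEY,
    customer_name TEXT NOT NULL,
    -- NOT NULL plus a foreign key: a customer must belong to a group
    group_code    TEXT NOT NULL REFERENCES customer_group(group_code)
);
"""


def build_normalised() -> sqlite3.Connection:
    """Create the two-table normalised model in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    # SQLite leaves foreign key enforcement off unless asked.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(NORMALISED_DDL)
    return conn


if __name__ == "__main__":
    conn = build_normalised()
    conn.execute("INSERT INTO customer_group VALUES ('G1', 'Retail')")
    conn.execute("INSERT INTO customer VALUES ('C1', 'Acme', 'G1')")
    print(conn.execute("SELECT * FROM customer").fetchall())
```

Note how keys, descriptive attributes and the relationship all sit together in the same two tables; that mixture is what the vault design pulls apart.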
The vault design would be as follows:
Key points to note are:
As can be seen, the vault design technique has indeed applied the principle of "Separate things that are stable from things that are less stable", in contrast with the normalised design. This has resulted in 6 objects in the vault design as opposed to just 2 in the normalised design. Nothing comes free, though: the increased number of entities (and the SQL joins necessary to corral the data) is the price to pay for gaining the agility.
The data vault design shown in this article deliberately differs from what one would see in an actual implementation: the illustration omits the surrogate keys for the entities. Surrogate keys are a necessity for the implementation, but are not needed to understand the data vault design per se. We will explain the need for surrogate keys and their use in the follow-on articles.
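To make the trade-off concrete, here is a minimal sketch of a six-object vault for the same example, again in Python with sqlite3. The split into two hubs, one link and three satellites (one of them on the link) is an assumption chosen to total six objects; all table and column names, including load_date and is_active, are hypothetical. In keeping with the simplification above, business keys are used directly and there are no surrogate keys.

```python
import sqlite3

# Assumed breakdown of the six objects: 2 hubs, 1 link, 3 satellites.
# All names are illustrative; business keys stand in for surrogate keys.
VAULT_DDL = """
CREATE TABLE hub_customer (
    customer_code TEXT PRIMARY KEY, load_date TEXT NOT NULL);
CREATE TABLE hub_customer_group (
    group_code TEXT PRIMARY KEY, load_date TEXT NOT NULL);
CREATE TABLE link_customer_group (
    customer_code TEXT NOT NULL REFERENCES hub_customer(customer_code),
    group_code    TEXT NOT NULL REFERENCES hub_customer_group(group_code),
    load_date     TEXT NOT NULL,
    PRIMARY KEY (customer_code, group_code));
CREATE TABLE sat_customer (
    customer_code TEXT NOT NULL REFERENCES hub_customer(customer_code),
    load_date     TEXT NOT NULL,
    customer_name TEXT,
    PRIMARY KEY (customer_code, load_date));
CREATE TABLE sat_customer_group (
    group_code TEXT NOT NULL REFERENCES hub_customer_group(group_code),
    load_date  TEXT NOT NULL,
    group_name TEXT,
    PRIMARY KEY (group_code, load_date));
CREATE TABLE sat_link_customer_group (
    customer_code TEXT NOT NULL,
    group_code    TEXT NOT NULL,
    load_date     TEXT NOT NULL,
    is_active     INTEGER NOT NULL,
    PRIMARY KEY (customer_code, group_code, load_date),
    FOREIGN KEY (customer_code, group_code)
        REFERENCES link_customer_group(customer_code, group_code));
"""

# Rebuilding the two-table picture takes five joins, each satellite
# filtered to its most recent row per key.
CURRENT_VIEW_SQL = """
SELECT hc.customer_code, sc.customer_name, hg.group_code, sg.group_name
FROM hub_customer hc
JOIN sat_customer sc ON sc.customer_code = hc.customer_code
JOIN link_customer_group l ON l.customer_code = hc.customer_code
JOIN sat_link_customer_group sl
  ON sl.customer_code = l.customer_code AND sl.group_code = l.group_code
JOIN hub_customer_group hg ON hg.group_code = l.group_code
JOIN sat_customer_group sg ON sg.group_code = hg.group_code
WHERE sc.load_date = (SELECT MAX(load_date) FROM sat_customer
                      WHERE customer_code = hc.customer_code)
  AND sg.load_date = (SELECT MAX(load_date) FROM sat_customer_group
                      WHERE group_code = hg.group_code)
  AND sl.load_date = (SELECT MAX(load_date) FROM sat_link_customer_group
                      WHERE customer_code = l.customer_code
                        AND group_code = l.group_code)
  AND sl.is_active = 1
ORDER BY hc.customer_code
"""


def build_vault() -> sqlite3.Connection:
    """Create the six vault tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(VAULT_DDL)
    return conn


def current_customers(conn: sqlite3.Connection) -> list:
    """Return (customer_code, customer_name, group_code, group_name) rows."""
    return conn.execute(CURRENT_VIEW_SQL).fetchall()


if __name__ == "__main__":
    conn = build_vault()
    conn.execute("INSERT INTO hub_customer VALUES ('C1', '2024-01-01')")
    conn.execute("INSERT INTO hub_customer_group VALUES ('G1', '2024-01-01')")
    conn.execute("INSERT INTO link_customer_group VALUES ('C1', 'G1', '2024-01-01')")
    conn.execute("INSERT INTO sat_customer VALUES ('C1', '2024-01-01', 'Acme')")
    conn.execute("INSERT INTO sat_customer_group VALUES ('G1', '2024-01-01', 'Retail')")
    conn.execute("INSERT INTO sat_link_customer_group VALUES ('C1', 'G1', '2024-01-01', 1)")
    # A name change adds one satellite row; hubs and link are untouched.
    conn.execute("INSERT INTO sat_customer VALUES ('C1', '2024-02-01', 'Acme Ltd')")
    print(current_customers(conn))
```

In this sketch a change to a descriptive attribute only ever appends a row to a satellite, while the hubs and the link stay as they were. The query, with its five joins, is the price referred to above.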
For now the intent is to emphasise the fact that
Watch this space for further articles on Data Vault.