The Case of the Bus Matrix
The Bus Matrix is one of the most important assets of an enterprise data warehouse, yet many of the solutions I have seen through the years either do not use it and lack to understand why it is so important.
Why is it so important? How can one create it? How can it be used?
Before we can go into what the Bus Matrix is, I want to clarify some basic concepts about Business Intelligence and Data Warehousing in general. Companies uses BI solutions (reports, dashboards, advanced analytics models etc.) because companies want to gain transparency into their day-to-day performance of the business processes they execute as a company. Like all companies, they do it because they want to improve the results of these business process.
A business process is the activities that are executed in the company as part of its operation. Taking a phone call, processing an order, delivering a product and so forth.
You cannot improve the execution of a business process, without measuring the metrics of the business process.
Ralph Kimballs dimensional modeling is therefore built on top of these business processes. The Enterprise Data Warehouse is the sum conformed and integrated models, each model aimed to collect the metrics of a single business process. Because each model is working on a single process, we can build it accurately and separately from other processes. Because it is conformed and integrated, we can combine metrics from multiple processes into a single report in a relatively simple way.
The Bus Matrix is the high-level model over the business processes that are executed daily/weekly/monthly by the business. It is a simple list of the business processes, explained in a simple business language, it tells us what the business process is, how often it occurs, what we can measure, and who are the key players in the process that describes it. Essentially the Bus Matrix lists the fact tables and the dimension tables we will need to build. It can be created and understood by the business, and it stores all the information the data team needs to deliver good business value.
It takes roughly 1-2 days to investigate and model one business process into the Bus Matrix. All we need is a subject matter expert who can describe the process and a data architect with knowledge of dimensional modelling to get it done. We will need to walk through the 4 steps of dimensional modelling which are:
After the workshop is over, we have a logical data model, that can be handed over to a team of data engineers that can build the fact and dimension tables needed to deliver the business value to the business users.
Let's walk through an example. Say our company is a lemonade stand selling tasty lemon juice at the street corner on hot summer days. We can easily identify the first 2 key business process of the company (we can easily identify more than two, but we do not need to have all og them to get started, and we can always add more processes in the future).
We then choose to prioritize the first process, and we are already beginning with step 1 Identify the business process. We meet the subject matter experts, and we learn that the company Happy Lemonade sells lemonade in glasses of 33 cl. We learn that there is a table placed on the corner of the street, and that each weekend from 10 to 14, they offer customers buy a glass of lemonade for 1$ or a bottle of 1 lt for 3$. We also told that there is a gallon of 50 liter at the beginning of the day, and that when they sold out, they close the stand earlier than 14:00.
Customers can get a discount if they have a loyalty card and follow the company on social media. Loyalty card holders can purchase 5 cups for 4$. Loyalty card holders can also purchase 2 bottles of lemonade for 5.5$. Not all customers have a loyalty card, so the lemonade booth do not know all the customers that come and purchase lemonade.
We now know that the Selling Lemonade process is a transactional process, that is, every instance of the process stands by itself. (Other options are snapshop and periodical snapshot for ongoing processes). We also perform step 2 and can identify the grain of the process, which is selling glasses or bottles of lemonade to customers.
Recommended by LinkedIn
We can then update our Bus Matrix with the following:
Now we move forward to step 3 which is to identify the dimension, the objects in the process. These are the loyalty program number, product and the date and time of the sell. Note that the grain states that we capture each product separately. The loyalty program number is not a mandatory dimension, so we mark it with an O for optional.
In step 3 we also note the keys and attributes of the dimensions.
Lastly, we need to identify the measures, which are how many units we sold, for what price and what is the volume of the product. We can also measure the discount that was given for the customer.
The end result is a model of a single process, and by looking into it we know that we need to have a transactional fact table with 4 dimensions columns and 4 measures. We need to create a date and a time dimension plus a loyalty program and a product dimension.
Without writing a single line of code, we can test the model by asking questions to the model and evaluate if we can meet the reporting needs based on the model.
We then repeat the same 4 steps of the other processes, and we get this:
If we have fact tables and dimension to match the Bus Matrix, we can perform complex analysis across the 4 processes, as the 4 fact tables are linked via shared dimensions in the model. We can easily find which batches have most waste, and manage our purchase orders, and manufacturing process to meet with the expected demand.
You can use the Bus Matrix to drive the prioritization of the development team. It allows you to apply agile methodologies as you are able to de-select parts of the bigger picture, without missing out on important parts, that leads to restructuring of the work already done.
Wow! This is a great read I just found in Google search. I am actually using your template...very well done! Thank you!