I am writing a book on SAS data warehousing, and I came across your article. I agree with what you said, but I just want to add that I think "fail" means "fails to attract users". The whole point of a data warehouse is to have users use it, and I feel it is a failure if not enough people use it to justify its existence. I'd be really interested if you would ever comment on that topic, because I think a "lack of users" has everything to do with the points in your article, especially a lack of vision as to what questions it is supposed to answer, and lack of attention to ETL. (I'm the ETL queen!) Thanks for the great article!

Like
Reply

Many data warehouses are also slowly dying. From having adhered to some principle to begin with they tend to deteriorate over time towards becoming "just a bunch of tables", which require quicker and dirtier solutions. At some point the maintenance cost when trying to cope with new requirements reaches the point where it is simply cheaper to start from scratch. I can recommend the paper "Data warehouse architectures: factors in the selection decision and the success of the architectures" in which 454 companies were analysed with respect to the successes or failures in data warehousing.

A common mistake is an unwillingness to commit to business keys. Instead technical keys from the source systems are reused. This is usually done because the thought is that there is a 1-1 relationship anyway, and the keys won't change...and then the source system gets a restore and the same business key gets another sequence. Another common mistake is where the DWH follows the source structure, instead of the business processes. A lot of source apps are bought and modified and thus don't fit the business model properly. By not building the DWH according to the buisiness processes (and units of work), the DWH gets stuck in the same complexities that faced the original source system. The DWH however has to tackle this in a short time, whereas the source has often accumulated these solution over several years. Additionally following the source structure creates complications when adding a second source, which has an incompatible structure. By following the business processes you should be able to add multiple sources to common tables. Finally a lot of DWH projects decide to model according the old traditional 3NF data warehouse. This quickly results in large dependency trees and long, slow batches.

I can not agree more with the above, I have remarked many times that a successful data professional must be interested in at least three things: the business, the technology and the business users. This will always be true in the future. If you want to spend your time coding and not talking to the business users, that's great but you don't belong on the data warehouse team...

In addition to your list, which I recognise from my own experience, I have to add two other point. The most important reason for failure is the lack of a cooperation model between users and developers. The answer to the why question has to be broken down in smaller subjects where the users of the information have to do a lot of work. We assume that users know the data or business process. That often proves to be wrong. The process of collaboration will reveal the actual merchanism of the business processes as recorded in the data. All too often that reengineering part of data warehouse creation is misunderstood. The second reason I have to add is lack of data modelling skills. I have seen really bad data modelling habits which prevent future changeability of the warehouse and thus shortening its lifespan. It is closely related to the first point: how can you model if you don't understand the business process?

Like
Reply

To view or add a comment, sign in

Explore content categories