12 Angry Data Points
I often feel like the lone voice in the room arguing for a focus on data quality. Hopefully by the end of this article I’ll have brought you all round to the idea, just like Henry Fonda in the film that inspired the title. Tenuous link out of the way, let’s get to it.
There are all sorts of stats out there about data quality:
“40 percent of businesses will fail to achieve their business objectives due to poor data quality.” (Gartner, 2011).
“92 percent of organizations find some aspect of data quality challenging for them and 63% lack a centralised approach to managing data quality.” (Experian, 2015)
“56 percent express concern about the integrity of data upon which they base decisions” (KPMG's Global CEO Outlook, 2018)
Right now you’re probably thinking: “Yeah? So what?” Compelling as these stats are, they don’t really paint a picture of real-world issues.
But there are some great examples of catastrophic failures in data quality or data processing. Take the Enron scandal of 2001, where the downfall of the company was directly attributed to a host of fraudulent financial data. It went from being worth nearly $50bn to bankrupt in less than a year. Or take NASA’s Mars Climate Orbiter, which was lost in 1999 because values were not converted from imperial to metric units during a calculation step, costing a spacecraft from a programme worth >$325m. This latter example highlights one of the key facets of data quality: being able to understand the provenance and lineage of the data. After all, without context a number is just a number.
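To make the "a number is just a number" point concrete, here is a minimal Python sketch of carrying units and provenance alongside each value. It is an illustration only, not a description of how any real flight software works: the `Impulse` class, the `source` field, the unit labels and the sample readings are all hypothetical. The only hard fact used is the standard conversion of 1 pound-force to 4.4482216152605 newtons.

```python
from dataclasses import dataclass

# Conversion factors to the SI unit of impulse (newton-seconds).
# 1 lbf = 4.4482216152605 N by definition.
TO_NEWTON_SECONDS = {
    "N*s": 1.0,
    "lbf*s": 4.4482216152605,
}


@dataclass(frozen=True)
class Impulse:
    """A value that carries its unit and where it came from (hypothetical)."""
    value: float
    unit: str
    source: str  # provenance tag, an assumption made for this sketch

    def to_si(self) -> "Impulse":
        # Refuse to guess: an unknown unit is an error, not a bare number.
        if self.unit not in TO_NEWTON_SECONDS:
            raise ValueError(f"unknown unit {self.unit!r} from {self.source}")
        factor = TO_NEWTON_SECONDS[self.unit]
        return Impulse(self.value * factor, "N*s", self.source)


def total_impulse_si(readings):
    """Sum readings only after converting each one to newton-seconds."""
    return sum(r.to_si().value for r in readings)


# Made-up sample readings from two hypothetical systems.
readings = [
    Impulse(10.0, "lbf*s", "ground software"),
    Impulse(10.0, "N*s", "flight software"),
]

naive_total = sum(r.value for r in readings)   # silently mixes units
checked_total = total_impulse_si(readings)     # converts before summing
```

With these made-up readings the naive sum treats both values as the same unit, while the checked sum converts first, so the two totals disagree by a wide margin. The value of keeping the unit and source with the number is that the mismatch becomes visible, or fails loudly, instead of passing through unnoticed.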
You can also find some more relevant, if less over-the-top, examples out there too (e.g. this one about a mis-input prescription, or this one about a car parts manufacturer).
Hopefully by this point I’ve convinced you that data quality is important and can have a direct and profound impact on your business.
This has never been more true than in today’s data-driven world. With the advent of the Internet of Things, more companies investing in ERP, CRM and other acronyms, and social media and marketing feeds coming in thick and fast, businesses can quickly become overwhelmed with data. As volumes go up, so does the risk of data quality issues, along with the complexity of mitigating those risks and managing the data.
A lot of people make the mistake of thinking this is a new problem that can be solved with new systems, but it is a problem as old as the concept of business. Consider the world of business when 12 Angry Men came out in 1957 (see, I can make more tenuous links back to it!). You would be lucky if your business knew what a computer was, let alone had one, or a database. In fact the first hard disk drive had only just been invented: the IBM Model 350 (see picture), introduced in 1956 and capable of storing just under 5MB. Despite this you still had reams of data; you just called it something else, and the reams were actual paper. A ledger. An inventory. “The books”. Files. Cases. Filing cabinets. Drawers. I could go on.
The point is, we’ve always had to manage data and make sense of it. We used to have less of it, and it was less accessible, but it has always been there. In many ways, because it was so inaccessible, we were much better at managing it. We used complicated index card systems and file rooms, and staff were employed simply to ensure things were archived properly. But as computers came onto the scene and businesses were able to clear their file rooms and put things into little black boxes, there seemed to be less need for the admin. Out of sight, out of mind.
Whilst we must now have huge administrative savings over the business world of the 1950s, perhaps, because the data we have is now somewhat hidden, we’ve forgotten that it needs managing. I’m not saying we need to go back to having hundreds of people maintaining ledgers, copying things out and summing things by hand, but I am saying that we need to put more effort in if we want to reap the benefits.
Without that effort we risk being stuck in a constant cycle of adding more rubbish to an ever-growing rubbish bin, and never realising benefits such as better demand forecasting using machine learning, a better understanding of what drives outstanding performance, or finding the best way to re-activate a dormant customer through advanced analytics.
I guess the question to end on for now is: how confident are you in your data? Maybe it’s time to do something about it.
Excellent article. It is amazing how much money is wasted in business collecting things multiple times because people don't realise they already have them. My favourite data quality story is from when the Royal Navy was installing new missiles onto warships and did a database search for all serving warships that were in dock. They got a surprise in Portsmouth when they turned up. Can you work out which famous ship nearly got missiles? Now for the shameless plug: Marine Data Management Courses run by OceanWise and IMAREST. Check out https://www.imarest.org/events-courses/training-courses/marine-data-management-awareness-course. Learn how to better manage your data!