Good data, bad data
Data makes the world go around…
There are lots of discussions about data at the moment. Some telling us why we’re missing out if we don’t get involved in the world of data and others prophesying that data is the future.
Is it as straightforward as we’re being told, though, and is all data good data?
As with all things, data can be as complex or as simple as we want to make it.
There is no doubt that good data helps a business see things from new angles and provides a new level of insight into how it operates.
Bad data does neither of these and can make a murky situation even cloudier.
How to separate the good from the bad
I recently attended a Business Analytics Forum hosted by DeeperThanBlue where there was an extremely interesting presentation by Paul Husbands from IBM who shared the chart below. This was part of a wider presentation about the use of data to make confident decisions but I think this chart highlights in a visual way the good and the bad aspects of data and data capture.
As a guide, data analytics moves through the following four stages, with capability and value (the x and y axes of the chart) increasing at each one:
1. Descriptive Analytics
As the name implies, descriptive analysis or statistics can summarise raw data and convert it into a form that can be easily understood. They can describe in detail an event that has occurred in the past. This type of analytics can yield useful information and possibly prepare the data for further analysis.
This is the most frequently used type of analytics across organisations. It’s crucial in revealing the key metrics and measures within any business.
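As a rough illustration of what "summarising raw data" can look like in practice, here is a minimal Python sketch. The sales figures and the describe function are made up for this example; a real report would pull its numbers from your own systems.

```python
from statistics import mean, median

# Hypothetical raw data: monthly sales figures for one product line.
monthly_sales = [1200, 1350, 980, 1500, 1420, 1100]


def describe(values):
    """Summarise a list of numbers into a few headline metrics."""
    return {
        "count": len(values),
        "total": sum(values),
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }


summary = describe(monthly_sales)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Nothing here predicts or explains anything; it simply turns six raw numbers into metrics a manager can read at a glance, which is the job of descriptive analytics.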
2. Diagnostic Analytics
The obvious successor to descriptive analytics is diagnostic analytics. Diagnostic analytics is a deeper look at data to attempt to understand the causes of events and behaviours.
In a structured business environment, tools for both descriptive and diagnostic analytics go hand-in-hand!
If your organisation makes it this far with data analytics then it is seen as being quite advanced in comparison to its peers.
3. Predictive Analytics
Any business that is pursuing success should have foresight. Predictive analytics helps businesses to forecast trends based on current events. Whether it’s the probability of an event happening in the future or an estimate of when it will happen, both can be determined with the help of predictive analytical models.
Usually, many different but co-dependent variables are analysed to predict a trend in this type of analysis. For example, in the healthcare domain, prospective health risks can be predicted based on an individual’s habits/diet/genetic composition.
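To make the idea of forecasting a trend a little more concrete, the sketch below fits a straight line through some past figures and extends it by one month. It is deliberately simple: the history is invented, only one variable is used (real predictive models usually combine many), and fit_line is a hypothetical helper written for this example.

```python
# Hypothetical history: units sold in each of the last six months.
history = [100, 112, 119, 131, 138, 152]


def fit_line(ys):
    """Least-squares straight line through the points (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
        (x - x_mean) ** 2 for x in xs
    )
    intercept = y_mean - slope * x_mean
    return slope, intercept


slope, intercept = fit_line(history)

# Extend the fitted line one step beyond the last observed month.
next_month = slope * len(history) + intercept
print(f"Trend: {slope:.1f} units per month, forecast: {next_month:.1f}")
```

A straight line is only one possible model, and it assumes the recent trend simply continues. The point is the shape of the process: learn a pattern from past data, then project it forward.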
4. Prescriptive Analytics
This type of analytics recommends the best course of action in a given situation. For instance, prescriptive analysis is what comes into play when your Uber driver gets the easiest route from Google Maps. The best route is chosen by considering the distance of every available route from your pick-up point to the destination and the traffic constraints on each road.
In essence, prescriptive analytics tells you what to do to get the outcome you want in the future.
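The route example can be sketched in a few lines of Python. This is not how any real mapping service works; the routes, the traffic factors, the assumed average speed and the estimated_minutes function are all invented to show the principle of scoring every option and recommending the best one.

```python
# Hypothetical candidate routes: distance in km and a traffic delay factor
# (1.0 = free-flowing, 2.0 = the journey takes twice as long).
routes = [
    {"name": "motorway", "distance_km": 12.0, "traffic_factor": 1.8},
    {"name": "ring road", "distance_km": 15.0, "traffic_factor": 1.1},
    {"name": "city centre", "distance_km": 9.0, "traffic_factor": 2.5},
]

AVERAGE_SPEED_KMH = 50  # assumed free-flow speed, the same for every route


def estimated_minutes(route):
    """Estimated travel time once traffic is taken into account."""
    free_flow_hours = route["distance_km"] / AVERAGE_SPEED_KMH
    return free_flow_hours * route["traffic_factor"] * 60


# The "prescription": recommend the route with the lowest estimated time.
best = min(routes, key=estimated_minutes)
print(f"Recommended route: {best['name']}")
```

Notice that the shortest route is not the one recommended here. Once traffic is factored in, a longer but clearer road wins, which is exactly the kind of decision prescriptive analytics is meant to support.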
These descriptions relate to the data analytics on the chart that add value and capability, but what about the bad data that appears on the z axis, running away from the chart?
This sort of data can still end up in descriptive and diagnostic data sets, where it can be extremely misleading.
5. Missing Data
The first of these is missing data. Missing data doesn’t change the values that are there, but the data set may not tell the full story. Data may be missing because people have accidentally left it out. Essentially, missing data is where there are empty fields that should contain data.
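A quick way to see how much of the story is missing is to count the empty fields. The sketch below assumes a small, made-up list of customer records in which both None and an empty string mean "nothing was entered"; the field names are hypothetical.

```python
# Hypothetical customer records; None and "" both stand for an empty field.
records = [
    {"name": "A. Smith", "email": "a.smith@example.com", "postcode": "S1 2AB"},
    {"name": "B. Jones", "email": "", "postcode": "LS1 4XY"},
    {"name": "C. Patel", "email": None, "postcode": None},
]


def is_missing(value):
    """Treat None and blank or whitespace-only text as missing."""
    return value is None or str(value).strip() == ""


def missing_counts(rows):
    """Count, for each field, how many records have no value in it."""
    counts = {}
    for row in rows:
        for field, value in row.items():
            counts.setdefault(field, 0)
            if is_missing(value):
                counts[field] += 1
    return counts


report = missing_counts(records)
for field, count in report.items():
    print(f"{field}: {count} of {len(records)} missing")
```

A report like this doesn't fix anything on its own, but it tells you which fields you can rely on before you start drawing conclusions from them.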
6. Incorrect Data
You may have input your data with good intent, but the data that is input is incorrect. It could be something as simple as data being entered into the wrong field, a spelling mistake, or data that hasn’t been normalised as per the system of record.
7. Misleading Data
Misleading data is where someone has intentionally included data that shouldn’t be in a data set, or has included certain data knowing that it distorts the story the data set is supposed to tell.
8. Misrepresented Data
Again, this sort of data can be purposely entered incorrectly, but it can also happen innocently. For example, an online form asks for your place of birth, your place of birth isn’t listed, and so you select the nearest location. In this instance, through no fault of your own, the data is misrepresented.
There are of course numerous ways that data can be manipulated or incorrectly used and the above covers the majority of reasons. Others could include:
Duplicate data: a single account, contact, lead, etc. that occupies more than one record in the database.
Poor data entry: misspelling, typos, transpositions, and variations in spelling, naming or formatting.
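Duplicates and inconsistent data entry often go hand in hand: the same contact ends up in the database twice because the name or email was typed slightly differently. As a minimal sketch, assuming a made-up contact list and a very simple rule (ignore capitalisation and extra spaces), duplicates could be flagged like this. It will not catch genuine misspellings, which need fuzzier matching.

```python
# Hypothetical contact records with variations in spacing and capitalisation.
contacts = [
    {"name": "Jane  Doe", "email": "Jane.Doe@Example.com "},
    {"name": "jane doe", "email": "jane.doe@example.com"},
    {"name": "John Roe", "email": "john.roe@example.com"},
]


def normalise(contact):
    """Lower-case and collapse whitespace so that variants compare equal."""
    return (
        " ".join(contact["name"].lower().split()),
        contact["email"].strip().lower(),
    )


def find_duplicates(rows):
    """Return (first_index, duplicate_index) pairs for matching records."""
    seen = {}
    duplicates = []
    for index, row in enumerate(rows):
        key = normalise(row)
        if key in seen:
            duplicates.append((seen[key], index))
        else:
            seen[key] = index
    return duplicates


duplicate_pairs = find_duplicates(contacts)
print(duplicate_pairs)
```

Here the first two records are flagged as the same person even though they look different at first glance, while the third is left alone.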
Can good data be used for bad deeds?
Based on the above good and bad data examples, it’s interesting to look at a real-life example of data being collected and used.
Disney World launched its innovative MyMagic+ programme in 2013. Every guest gets their own MagicBand wristband, which acts as identification, hotel room key, tickets, FastPasses, and credit card.
Guests swipe the band at sensors located around the park to gain entry to attractions or pay for items, giving Disney a wealth of data on where its guests are, what they’re doing and what they may need. This data allows Disney to anticipate guests’ needs and deliver a personalised experience.
The idea is to create a totally frictionless experience for visitors, for example, by letting visitors reserve their place for certain rides and attractions before they leave home, so they don’t have to queue when they get there. Long waiting times are one of the biggest challenges for any theme park – because, when you’re queueing in line, you aren’t spending money in the park’s shops or restaurants. So any progress in eliminating long lines is critical for both guests and the business.
Using data gathered from guest wristbands, Disney can understand where the busy parts of the park are, and make smarter decisions to relieve the pressure, such as incentivising guests to go on another ride or adding more staff in congested areas. This allows for more efficient use of the park.
Disney’s Next Generation Experience project meant that the company applied for a patent for a camera-equipped robot that tracks visitor movement throughout the park via their shoes. This data could be used to understand not only which rides are most popular, but also which routes and paths around the park are used the most. It could also be used to deliver operational improvements, such as more efficient scheduling of the nearly 250,000 shifts of 80,000 employees (and that’s just per week), or improving marketing decisions, using past data on guest preferences and behaviours to design new packages and offers. Disney is even toying with the idea of robotic characters that would move around the park and mingle with guests.
In terms of technical details, the wristbands incorporate RFID and long-range radio technology. They communicate with the thousands of sensors around the park and stream real-time data for analysis. According to the Disney website “The RF devices are not GPS-based and do not enable collection of continuous location signals. Instead, MyMagic+ uses both short- and long-range readers located within the Resort to deliver the benefits of MyMagic+.”
One of the interesting things about Disney’s MagicBand is its potential to be seen as Big Brother watching. Nobody likes the idea of machines knowing their every move. Yet, as the data and technology fuel a superior customer experience, does this offset the potential invasion of guests’ privacy? Would you be happy knowing that the data you are producing by walking around the park and going on various rides is being used to know where you are at a particular time?
This all goes to show that customers will willingly part with an awful lot of data, so long as it’s ultimately worth their while.
Data can be used to back up gut instinct, or it can be made to
As has been demonstrated already, data can be used in a lot of different ways, and that doesn’t always mean that the data is correct.
Data can be manipulated to build a narrative or tell a story that you want to tell.
The former British prime minister Benjamin Disraeli is famously quoted as saying: "There are three kinds of lies: lies, damned lies, and statistics." Data sits in the middle of all of that: it is used to illustrate trends, but it can also be used to demonstrate whatever you want it to show.
This is where critical thinking and common sense need to be used. Before accepting the raw data and the conclusions presented at face value, have a think about who has presented the data and why they would want to present such a message. It is very likely that the information is factually correct and backs up the gut instinct of the person presenting the data, but on the other hand, it could have been manipulated to some degree.
If people focus on a half-truth message without considering whether it actually makes sense, or thinking about how the data was collected, or whether the author has an agenda, then the validity of any chart could be compromised.
So what really makes data bad? Not so much the data but the way people blindly interpret it.
Data, if used correctly, can give a competitive advantage
Data on its own adds very little value to an organisation; it needs to be processed to generate insights. Insights are generated through the descriptive and diagnostic modelling techniques discussed at the start of the article.
However, insights on their own do not directly add value. The value comes when the organisation uses the insights to make a decision that will lead to more value.
Value could be in many forms, for example, increased sales, better service levels, increased efficiency through operations, engaged and loyal customers, or simply recognising that some activities contribute very little to the performance of the organisation.
Predictive and prescriptive data modelling are the next step in data analytics and this is what can drive the business forward and give a competitive advantage.
Once your organisation has mastered these data analytic areas you can move on to artificial intelligence (AI) and chatbots!
Bringing it all together
Data analytics can be an extremely complex area and, as discussed here, there are many advantages for your organisation in getting this right.
By creating good data and pulling insight and foresight from it, as well as making sure that bad data is removed, you can see a clear way to move into the realm of predictive and prescriptive analytics, which will help you stay ahead of the curve and make decisions confidently based on your data.
#financefundamentals
#RobinKiziak is an experienced finance manager with over 10 years’ experience in the distribution/logistics and retail industries.