DATA ANALYTICS - GOOD AND BAD
Have you noticed that data analytics is becoming an integral part of our personal and professional lives? And this change is happening remarkably fast.
At a personal level, even buying a car involves a good amount of data analysis. You may compare the options within a segment on price, fuel economy, comfort, advanced features, safety, and much more, then weigh each of these considerations against your own priorities and pick the best option.
At work, data gives us insight into the factors affecting the industry, the market, hiring choices, and more, and helps us make informed decisions on almost everything. How heavily data is used may vary, though, depending on the impact of a decision on the business.
Data allows us to make better decisions for current as well as future business scenarios. But with so much data coming in, how do you make sense of it? Are you tempted to hand it to your data experts or analysts and let them come back with key actions? Well, this works only when you have a clever team handling the data.
It's quite important that, as a decision maker, you have a basic understanding of data analytics and of how these experts reached their results.
Below are a couple of simple examples of analytics that might help you see the nuances of good and bad data analytics.
Let’s look at a couple of GOOD EXAMPLES first!
Simple regression analysis proves highly valuable
A large mining company in Zimbabwe was concerned about over-staffing or under-staffing in various departments and sensed that it was losing money as a result.
Before any analysis was done, a hypothesis was formed: the number of staff in a department is driven by the business activities it carries out.
A regression analysis revealed a strong relationship between the number of employees in a department and the business activities it carried out. See the graphic below. The R-squared of 70.34% means that about 70.34% of the variation in departmental headcount is explained by the business activities the departments carry out.
The analysis clearly showed which departments were understaffed and which were overstaffed: departments with more staff than the regression line predicted for their level of activity were overstaffed, and those with fewer were understaffed. The company could either reduce headcount in overstaffed departments or move staff with matching skills to understaffed ones.
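To make the idea concrete, here is a minimal sketch of how such an analysis could be set up in Python. The department names, activity volumes, headcounts, and the tolerance of 5 staff are all invented for illustration; they are not taken from the case study, and the actual analysis may well have used different variables and tools.

```python
from statistics import mean

# Hypothetical data: (department, activity volume, current headcount).
# These figures are made up for illustration only.
departments = [
    ("Drilling", 120, 48),
    ("Processing", 200, 70),
    ("Maintenance", 90, 45),
    ("Logistics", 150, 50),
    ("Safety", 60, 25),
    ("Admin", 40, 30),
]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variation in y that the fitted line explains."""
    y_bar = mean(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def staffing_gaps(rows, tolerance=5):
    """Compare actual headcount with the headcount the line predicts.

    `tolerance` is an assumed threshold (in staff) below which a gap is
    treated as noise rather than over- or under-staffing.
    """
    xs = [activity for _, activity, _ in rows]
    ys = [staff for _, _, staff in rows]
    slope, intercept = fit_line(xs, ys)
    result = {}
    for name, activity, staff in rows:
        gap = staff - (intercept + slope * activity)
        if gap > tolerance:
            label = "overstaffed"
        elif gap < -tolerance:
            label = "understaffed"
        else:
            label = "in line"
        result[name] = (round(gap, 1), label)
    return result


if __name__ == "__main__":
    xs = [row[1] for row in departments]
    ys = [row[2] for row in departments]
    slope, intercept = fit_line(xs, ys)
    print("R-squared:", round(r_squared(xs, ys, slope, intercept), 4))
    for name, (gap, label) in staffing_gaps(departments).items():
        print(name, gap, label)
```

The gap between actual and predicted headcount (the residual) is what flags a department. A real study would also check whether one activity measure is enough, or whether several drivers are needed.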
Click the link to read the full case study.
Retaining the key talent at Nielsen
Nielsen’s HR was approached by the leader of one of its biggest businesses with a concern that people in her team were leaving. HR also felt that attrition was rising company-wide, so they decided to build a basic data model to understand what was happening.
For a company like Nielsen, it was easy to build a nice colorful dashboard with lots of interesting metrics. But they realised: “if this analysis doesn’t answer the critical question of reasons behind the attrition, it’s not worth it.”
Nielsen created a predictive model with multiple variables, including age, gender, tenure, and manager rating, to understand what was going on. The model returned these key insights:
- The first year mattered the most. If employees didn’t even make it to their first performance review, their likelihood of leaving rose exponentially.
- Gender and ethnicity didn’t play a role in tenure, which went against the initial hypothesis, and they were happy to learn this.
- Getting promoted encouraged people to stay, but lateral moves were also a strong motivator for people to stay.
The important outcome was that the people with the highest attrition risk in the next six months were approached, and the company was able to move 40% of them to a new role. These lateral moves increased an associate’s chance of staying with the company by 48%. Could this be possible with just intuition?
Click the link to read the full case study
If you look at both these examples, the critical step in data analysis is to create a hypothesis and let the data prove or disprove it, rather than making decisions purely on hypothesis or intuition.
A couple of BAD EXAMPLES too!
Pinterest Accidentally Congratulates Single Women on Their Weddings
Social media, including Pinterest, is a go-to place for posting ideas and inspiration through photos and videos, and brides-to-be may be especially active here, posting their make-up, hairstyles, dresses, menus, and so on. They like to share their joy with family and friends this way, which is very natural.
It is nothing new that social media platforms collect huge amounts of user data to serve their users relevant content and ads. But Pinterest got itself into trouble when it accidentally emailed some users congratulating them on their upcoming weddings. It soon became clear that many of these women weren’t getting married, and some of them were actually single. Many of the unhappy recipients voiced their displeasure on Twitter and shared the erroneous email. Pinterest had to issue an apology for the mistake.
While data analytics is extremely useful in running marketing campaigns and driving conversions, it’s important to ensure that the data is reliable and accurate, that the analysis isn’t biased or missing context, and that the tools and technology used are foolproof.
Correlation and causation
Correlation and causation are widely misunderstood terms that are often used interchangeably. Correlation is a statistical measure of how strongly a pair of variables are linearly related and change together. It does not tell us why or how; it only says that a relationship exists.
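As a rough illustration of what "linearly related" means, the sketch below computes the Pearson correlation coefficient by hand. The function name and the sample numbers are assumptions made up for this example. The second data set shows a limitation worth knowing: two variables can be perfectly related in a non-linear way and still have a correlation of zero.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists.

    Ranges from -1 (perfect negative linear relationship) to +1 (perfect
    positive linear relationship). Assumes neither list is constant.
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Illustrative numbers only.
linear_x = [1, 2, 3, 4, 5]
linear_y = [2, 4, 6, 8, 10]      # y doubles x: a perfect linear relationship

curved_x = [-2, -1, 0, 1, 2]
curved_y = [4, 1, 0, 1, 4]       # y is x squared: related, but not linearly

if __name__ == "__main__":
    print("linear:", pearson(linear_x, linear_y))
    print("curved:", pearson(curved_x, curved_y))
```

In neither case does the coefficient say anything about which variable, if any, drives the other.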
"Correlation is not causation" means that just because two things correlate does not necessarily mean that one causes the other. In many cases, the correlation is simply a coincidence, as seen in the chart below.
"Eat more mozzarella cheese if you want to earn a civil engineering doctorate :)"
Important as it is, the phrase "correlation is not causation" is often not taken seriously. Our prejudices and suspicions about the way things work make us believe that correlation is causation without any hard evidence.
Drawing a conclusion as soon as a correlation is found is therefore a poor example of data analytics. Correlation is only the first step: enough data must be examined to uncover other underlying factors. Find the hidden factors, verify that they are correct, and only then conclude!
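One way to look for a hidden factor is to check whether the correlation survives once a suspected third variable is taken into account. The sketch below simulates a made-up scenario (temperature driving both ice cream sales and sunburn cases) and compares the raw correlation with the partial correlation that controls for temperature. The scenario, variable names, and noise levels are all assumptions for illustration, and partial correlation is only one of several techniques that could be used here.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def partial_correlation(xs, ys, zs):
    """Correlation between x and y after removing the linear effect of z."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate(n=2000, seed=42):
    """Simulated data in which temperature drives both other variables.

    Ice cream sales and sunburn cases never influence each other here;
    each is just temperature plus independent random noise.
    """
    rng = random.Random(seed)
    temperature = [rng.gauss(0, 2) for _ in range(n)]
    ice_cream = [t + rng.gauss(0, 1) for t in temperature]
    sunburn = [t + rng.gauss(0, 1) for t in temperature]
    return temperature, ice_cream, sunburn


if __name__ == "__main__":
    temperature, ice_cream, sunburn = simulate()
    print("raw correlation:", round(pearson(ice_cream, sunburn), 3))
    print("controlling for temperature:",
          round(partial_correlation(ice_cream, sunburn, temperature), 3))
```

In this simulation the raw correlation is strong, yet it largely disappears once temperature is controlled for, which is the pattern a hidden common cause tends to produce. A correlation that does survive such a check still does not prove causation; it only rules out that one suspected factor.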
Conclusion
Data analytics is highly valuable to businesses only when it is done in a proper and methodical way.
Have you come across any examples of correlation being mistaken for causation?