HR Analytics: Is HR ready for Big Data?
Summary: There is a lot of talk about Big Data, but does it really work, is it relevant to HR, and should we be investing to build new capabilities? The real-world example of Google’s flu prediction algorithm highlights both the power of Big Data and the pitfalls of an emerging technology. Before risking an investment in Big Data, HR should first explore the simple and fast way to People Insights – the readily available, underutilised data that sits inside every organisation’s HR systems.
Go to Qlearsite for other resources and the original article.
The ‘Big’ Challenge
‘Big Data’ is here and everyone is talking about it. But that doesn’t mean everyone understands it. What is Big Data? How should we use it? What are the benefits? Curiosity about these questions has made Big Data a rapidly trending search term on Google, with searches increasing by over 2,000% in recent years. The interest and excitement are palpable. But what are we excited about? Is that excitement matched by reality? How close are we to making better decisions using increasingly large datasets?
Whisper it, but Big Data might be overhyped. Or at least, not yet fully grown up. Although the internet is flooded with people talking about Big Data, few businesses are making use of this technology. Bain reports that even amongst the largest corporates (i.e., those with sales above $1bn), which have vast resources, only 4% have adopted Big Data in any part of their business.
Businesses are seemingly cautious, perhaps because Big Data can sometimes be dangerously wrong. Even the famously data-savvy wizards of Silicon Valley have been caught out. An example is Google’s algorithm to predict outbreaks of flu, a great idea that got a lot of people excited: an early warning system that would allow medical supplies to be dispensed before outbreaks occurred, minimising contagion and saving lives. Unfortunately, the algorithm was a famous failure, despite the fanfare that greeted its arrival.
Google built their algorithm by selecting the ‘top 1000’ search terms (from over 50m) that best correlated with influenza levels. The idea was simple – flu levels can be linked to search terms such as ‘why am I shivering?’ and ‘hot chicken soup recipe’. Any spike in the correlated search terms would help predict the beginnings of a flu outbreak. Unfortunately, amongst the ‘top 1000’ were several terms that were pure coincidences. For example, ‘basketball’ was included as a predictor. It just happened that the NBA season started in winter, causing a spike in this term at the exact same time as flu levels rose. Clearly basketball does not cause flu, but it was highly correlated. Or at least it used to be highly correlated until the season start dates changed.
Terms like ‘basketball’ created noise in the prediction, and in 2009 an unseasonable outbreak of flu was not predicted or anticipated. Don’t worry, the story ends well. Over time, Google vastly improved the accuracy of their flu predictions by improving the algorithm and incorporating other sources of data. But the point is this – we applauded the predictive power of the algorithm before it was proven. Big Data is an emerging science and we should be careful how we use it. In this case, perhaps hubris overtook reality and we prematurely celebrated the end of unexpected flu outbreaks.
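The basketball problem can be sketched in a few lines of Python. This is only an illustration of selecting terms by correlation, not Google’s actual method: the weekly figures, the `garden furniture` term and the function names `pearson` and `top_terms` are all invented for the example.

```python
from math import sqrt

# Hypothetical weekly figures, invented purely for illustration.
flu_cases = [10, 12, 30, 55, 60, 40, 15, 11]
search_volume = {
    "why am I shivering": [5, 6, 14, 27, 31, 19, 8, 6],
    "hot chicken soup recipe": [20, 22, 35, 50, 58, 41, 25, 21],
    "basketball": [3, 4, 28, 52, 57, 38, 12, 5],  # follows the season, not illness
    "garden furniture": [40, 38, 12, 5, 4, 10, 35, 42],
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def top_terms(volume_by_term, target, k):
    """Return the k terms whose volume correlates most strongly with target."""
    scores = {term: pearson(series, target) for term, series in volume_by_term.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Selection on correlation alone lets 'basketball' in as a predictor.
selected = top_terms(search_volume, flu_cases, k=3)

# If the season start moves, the same term no longer tracks flu at all.
basketball_shifted = [52, 57, 38, 12, 5, 3, 4, 28]
shifted_score = pearson(basketball_shifted, flu_cases)
```

In this toy data `basketball` is selected alongside the two genuine symptom terms, and its correlation with flu turns negative once the season dates shift. Correlation alone cannot tell a coincidental predictor from a meaningful one.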
The Choice for HR
HR is a fast-emerging consumer of analytics. Just this week Josh Bersin published a new blog post called ‘People Analytics Takes Off’. There is a clear need in HR for better, more accurate and more affordable People Insights, but does that include Big Data techniques?
Qlearsite’s answer – as the developer of powerful HR analytics software – is that ‘Big Data’ for HR isn’t ready yet. There are two principal reasons. Firstly, most businesses are still working hard to get accurate information from their core HRMS – without this foundation, they’re not ready for a Big Data approach. Secondly, there is a huge amount of information lying dormant inside businesses that offers valuable insights and is far more readily available – for example, messy, unstructured data.
Unstructured data is estimated to represent 80% of the information stored in an organisation. By auto-reading and analysing the messy text from thousands of performance reviews, exit interviews and other HR conversations, we can create real stories about people’s experiences at work. This enables us to break out of drop-down list classifications and start to explore themes created from the words of our employees. Look out next month for an article on how we are pioneering this and other techniques with leading UK businesses.
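As a rough illustration of letting themes emerge from employees’ own words, the sketch below counts how many comments mention each word. It is a deliberately simple word count, not Qlearsite’s technique; the comments, the stop-word list and the function name `theme_counts` are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical exit-interview comments, invented for illustration.
comments = [
    "My manager never gave feedback and the workload was too high.",
    "Great team, but pay was below market and workload kept growing.",
    "No career progression; my manager blocked every promotion.",
]

# Assumed stop-word list; a real project would use a much fuller one.
STOP_WORDS = {
    "my", "and", "the", "was", "too", "but", "no",
    "every", "never", "kept", "gave", "below",
}


def theme_counts(texts):
    """Count the number of comments in which each non-stop word appears."""
    counts = Counter()
    for text in texts:
        words = set(re.findall(r"[a-z]+", text.lower()))
        counts.update(words - STOP_WORDS)
    return counts


themes = theme_counts(comments)
# themes.most_common() lists candidate themes, most frequent first.
```

Here `manager` and `workload` each appear in two of the three comments, which is the kind of signal a drop-down ‘reason for leaving’ field would not capture.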
Furthermore, even the basic data sitting inside a core HRMS or payroll system can be very revealing. For example, a simple employee register contains basic information (e.g., employee number, start date, etc.) that can be converted into a compelling picture of an organisation. Organisation structure, attrition rates, promotion rates, spans of control and much more besides are hidden inside this simple and often neglected file. All it requires is some careful processing using HR analytics software - or a lot of manual crunching in Excel - to reveal powerful people insights.
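To make this concrete, here is a minimal sketch of deriving two of those measures from a register. The register layout, the sample rows and the attrition definition used (leavers in the year divided by the average of opening and closing headcount) are assumptions for illustration; organisations define these metrics in different ways.

```python
from collections import Counter
from datetime import date

# Hypothetical register rows: (employee_id, manager_id, start_date, leave_date)
register = [
    ("E1", None, date(2015, 1, 5), None),
    ("E2", "E1", date(2016, 3, 1), None),
    ("E3", "E1", date(2017, 6, 12), date(2019, 4, 30)),
    ("E4", "E2", date(2018, 9, 3), None),
    ("E5", "E2", date(2018, 11, 19), date(2019, 8, 15)),
    ("E6", "E2", date(2019, 2, 4), None),
]


def is_active(start, leave, day):
    """True if an employee with these dates is employed on the given day."""
    return start <= day and (leave is None or leave >= day)


def headcount_on(day):
    """Number of employees active on the given day."""
    return sum(1 for _, _, start, leave in register if is_active(start, leave, day))


def attrition_rate(year):
    """Leavers in the year divided by average of opening and closing headcount."""
    leavers = sum(
        1 for _, _, _, leave in register if leave is not None and leave.year == year
    )
    average = (headcount_on(date(year, 1, 1)) + headcount_on(date(year, 12, 31))) / 2
    return leavers / average


def spans_of_control(day):
    """Direct reports per manager, counting only employees active on the day."""
    return Counter(
        manager
        for _, manager, start, leave in register
        if manager is not None and is_active(start, leave, day)
    )


rate_2019 = attrition_rate(2019)
spans = spans_of_control(date(2019, 12, 31))
```

Nothing beyond employee number, manager, start date and leave date is needed, which is the point: the register alone already supports these measures.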
Our point is this – before we explore external data using complicated and sometimes inaccurate Big Data techniques, let’s start with the data about our own people, sitting dormant inside our HR system. There are powerful people insights waiting to be discovered, even when internal data isn’t perfect (see note 1).
A Bigger Future
As analytics becomes more prevalent in HR teams, we will find more examples of Big Data – especially in recruitment analytics where external data is most important. But for the moment, exploiting the data inside our organisations seems a simpler, faster and more valuable opportunity to focus on. We believe HR Analytics is a journey that starts with a firm foundation built upon internal data. Only when organisations are data mature should they consider supplementing insights using large external datasets. Big Data for HR could be big … but right now, it is still small.
Note 1: We are often told by HR teams that their databases might contain errors or even have gaps. In our experience, these gaps are relatively easy to repair to an acceptable level of accuracy (i.e., the 80:20 rule). Bots can semi-automate this process by doing data checking (e.g., spotting duplicates) and repairing basic errors. This saves a significant amount of time and replaces an otherwise laborious, manual data checking process. The fear of such a manual process is often the reason that the data remains unrepaired and this source of invaluable, powerful insights is unused.
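A minimal sketch of the kind of automated check described in Note 1 is shown below. The record layout, sample rows and function names are hypothetical; the check simply flags records that share a normalised name and start date, and records with no start date, for a person to review.

```python
from collections import defaultdict

# Hypothetical HR records, invented for illustration.
records = [
    {"id": "1001", "name": "Ann  Smith", "start": "2018-03-01"},
    {"id": "1002", "name": "ann smith ", "start": "2018-03-01"},
    {"id": "1003", "name": "Raj Patel", "start": "2019-07-15"},
    {"id": "1004", "name": "Raj Patel", "start": ""},
]


def normalise(name):
    """Lower-case the name and collapse repeated or trailing whitespace."""
    return " ".join(name.lower().split())


def find_issues(rows):
    """Return (groups of likely duplicate ids, ids with a missing start date)."""
    groups = defaultdict(list)
    missing_start = []
    for row in rows:
        if not row["start"]:
            missing_start.append(row["id"])
        groups[(normalise(row["name"]), row["start"])].append(row["id"])
    duplicates = [ids for ids in groups.values() if len(ids) > 1]
    return duplicates, missing_start


duplicates, missing_start = find_issues(records)
```

On this sample the two ‘Ann Smith’ records are grouped as a likely duplicate and record 1004 is flagged for its missing start date. The check only flags candidates; deciding which record to keep remains a human judgement.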
I agree with Peter Clark about the rubbish-in, rubbish-out indiscriminacy and precariousness of unstructured big data, and his example of Google's attempt to predict a flu outbreak is a classic. Not that they lacked evidence to tell them it wouldn't work, such as HP's botched acquisition of Autonomy for £7bn. Despite HR getting the least funding within many organisations, in some ways it is ahead of many departments at learning from structured data. My own company provides clients of our Cloud-based software with a great reporting tool, which we also use across the aggregated client base to create anonymised trending reports. These have turned out to be amazingly accurate and give phenomenal foresight. Not to give away too many industrial secrets, but talent pools have become far more successful at hiring than job boards, and social media hiring is bombing into oblivion.