When the average is a liar
When it comes to analysis, it’s easy to fall into the trap of calculating the sum, then the average, and then stopping. We might chuck a standard deviation metric into our analysis every now and again, which probably never gets read or understood. But three lesser-known, easily calculated and super-powerful metrics in the statistical universe that have become my favourites are the Coefficient of Variation (CV), Kurtosis and Skew.
It’s unfortunate that the CV has such a terrible name. Doesn’t exactly roll off the tongue. Kind of makes you feel like Professor Frink from the Simpsons. Busting out a “What’s the Coefficient of Variation” isn’t buying anyone any street cred.
But it is a very simple metric.
For a given dataset, you take the standard deviation and divide it by the average.
But why? If I barely care about standard deviation, why would I do more work on top of it?
Standard deviation tells you how variable your dataset is. But it isn’t relative: it’s an absolute number expressed in the units of the data, so used in isolation or compared across different datasets, it might not be as meaningful.
Imagine I have 2 customers, A & B. Customer A has an average order value of $1000 and a standard deviation of $100. Customer B has an average order value of $200 and a standard deviation of $30. The CV of customer A is 10% ($100 ÷ $1000) and of customer B is 15% ($30 ÷ $200). Even though customer B’s standard deviation is smaller in dollar terms, their orders are relatively more volatile, and the CV makes that abundantly clear. We don’t want people doing maths in their heads when we display metrics, so we give the number relativity and scale.
Easy, right? The CV helps us understand how much our data fluctuates relative to its mean.
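As a sketch, the calculation from the example above looks like this in Python. The order lists are made-up numbers, and I’m using the population standard deviation for simplicity:

```python
import statistics

def cv(data):
    """Coefficient of variation: std dev as a fraction of the mean."""
    return statistics.pstdev(data) / statistics.mean(data)

# Hypothetical order values for two customers
customer_a = [900, 1000, 1100, 950, 1050]   # high-value, steady
customer_b = [150, 250, 180, 230, 190]      # lower-value, choppier

print(f"Customer A CV: {cv(customer_a):.1%}")
print(f"Customer B CV: {cv(customer_b):.1%}")
```

Customer B comes out relatively more volatile than customer A, even though B’s standard deviation is smaller in raw dollar terms.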
But wait - Bonus marks. If we’re using some good statistical tools like KNIME, we can very easily pair what we learn about a dataset when looking at the CV with its Kurtosis and Skew.
Kurtosis is an obscure hero of statistical analysis, lurking in the shadows while mean and variance hog the spotlight. It measures the "tailedness" of the distribution of a dataset. In simpler terms, if I plotted all the datapoints on a frequency distribution to show how many times I get each result, the kurtosis tells me about the tails to the left and right of that distribution curve: how fat or skinny they are. And why should you care? Because those tails can tell you a lot about the underlying data and its potential for surprises.
1. The Long Tail (Leptokurtic Distributions) – Kurtosis > 3.0. Leptokurtic distributions have longer tails and a sharp peak in the middle. In the world of data, this means more values are in the tails and fewer are near the mean. More volatility.
2. The Short Tail (Platykurtic Distributions) – Kurtosis < 3.0. On the flip side, we have platykurtic distributions. These have shorter tails and a flatter peak. Here, the data points are more evenly spread around the mean, and extreme values are less likely.
3. The Goldilocks Zone (Mesokurtic Distributions) – Kurtosis = 3.0. Right in the middle, we have mesokurtic distributions, akin to a perfectly balanced cake: neither too flaky nor too dense. If you imagine a classic bell curve, you have a mesokurtic distribution.
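The three zones above can be sketched with a few lines of Python. This uses Pearson’s definition, where a normal distribution sits at 3.0 (be aware that many libraries report "excess" kurtosis with the 3 already subtracted, so their baseline is 0):

```python
def kurtosis(data):
    """Pearson's kurtosis: fourth central moment over squared variance.
    A normal distribution scores ~3.0; fatter tails score higher."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m4 / m2 ** 2

# Evenly spread data: short tails, platykurtic (< 3.0)
print(kurtosis([1, 2, 3, 4, 5]))      # 1.7
# Mostly steady with two extreme outliers: fat tails, leptokurtic (> 3.0)
print(kurtosis([0] * 8 + [-5, 5]))    # 5.0
```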
A few years ago, I used the combination of Kurtosis and CV to segment and understand the buying patterns of a list of thousands of customers based on hundreds of thousands of orders. Which ones put in consistent frequent orders, which ones put in volatile, unpredictable orders. Which ones should get a regular EDM to bump up their regular orders by a little. Which ones need love and attention in the one or two months when they put in an order. It helped make targeted marketing decisions across a large customer base.
Last, my favourite performance metric: Skew. I loooove seeing the Skew and distribution curves when understanding whether assets are performing. If someone states, “I need another machine because this one is at capacity”, the skew will tell me whether that’s true. Skew tells me if that distribution curve is sloping left or right.
Assume you’re analysing a metric for which a higher frequency is better, like widgets produced per hour (as opposed to, say, breakdowns per hour, for which lower is better).
Edge of the limit right slope - If your distribution curve is sloping to the right, in Aussie terms we would say “you’re flogging the pants off that machine”. More often than not, you’re running it close to its maximum achievable capacity, and your average may be dragged down by a few lower outliers. A right-sloping curve can also indicate that your process is under control: you know how many you can do, and you hit that number often (provided you actually need to be running at the limit).
Lazy left slope - If it slopes to the left, the machine is usually operating below what it’s capable of, and the occasional high-output bursts may pull your average above the normal operating level. If I have a lazy left slope but a machine is hitting capacity at certain times, my mind immediately goes to “can I even out the workload and spread it through the operating period, rather than investing in a new machine to handle the infrequent peaks?”
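The same moment-based approach works for skew (this is the Fisher-Pearson coefficient). With the made-up machine readings below, a machine flogged against its capacity with a few slow hours produces a negative skew, which is the right-sloping shape described above, while a machine cruising below capacity with occasional busy peaks skews positive:

```python
def skew(data):
    """Fisher-Pearson skewness: third central moment over variance^1.5.
    Negative = long left tail (mass piled up on the right); positive = the reverse."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    return m3 / m2 ** 1.5

# Hypothetical widgets-per-hour readings: mostly at capacity (~100),
# with one slow hour dragging the average down
flogged = [98, 99, 100, 100, 100, 97, 60, 99]
# Mostly cruising well below capacity, with occasional busy peaks
lazy = [60, 62, 58, 61, 59, 63, 100, 98]

print(skew(flogged))   # negative: right slope, at the limit
print(skew(lazy))      # positive: lazy left slope
```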
In the world of data, being average is overrated; we want to embrace and understand the variability. A distribution curve will tell you much of this at a glance, and if you have the option of showing either a distribution curve or a single average number, I would always encourage showing the distribution curve. If, however, you’re making heads or tails of thousands of objects, have a look at the CV, Kurtosis and Skew, and let them be your guide.