Statistical Process Control
A few months ago, I posted an article on comparing KPIs. I illustrated a methodology that compares the observed KPI to a predicted range: if the KPI falls outside the predicted confidence range, we should take a closer look for signs of problems.
Because the predicted range is estimated from a multivariable model, this approach assumes we already have some understanding of the system we are studying. What if we don't?
Statistical Process Control
Statistical Process Control (SPC) is a collection of inferential methods used to monitor and identify sources of variation in a process. The objective is usually to make an informed decision on whether or not to continue a production process. There are two focus areas in SPC: process control and acceptance sampling. In this article I will focus on process control. SPC was pioneered by Shewhart and later popularized by Deming.
Walter Shewhart (left) and W. Edwards Deming (right).
The aim of process control is to understand whether deviations in a process are due to chance variation or to a yet-to-be-known assignable cause.
When the state of knowledge around a system or a process is limited, SPC's univariate methods can help us formulate a question and focus the investigation.
Example
Donald Wheeler, in his book Understanding Variation, offers numerous real-world examples. Here, I illustrate one. The table below (Wheeler, 2000) represents a common management report:
Table 1
The percent difference column (5th from left) shows current vs. monthly average. One number that stands out is the +42% increase in In-Process Inventory. A legitimate question to ask ourselves is: is this unusual?
Data
To find out, we will need the raw data for In-Process Inventory. There are two things to notice. First, in this particular case we are dealing with cumulative data (counts); therefore, sampling is not a concern (it would be if, for instance, we were dealing with weights of chocolate bars). Second, there should be no fundamental changes in the process during the selected time window.
Table 2
Control Charts
Two charts, used in conjunction, can help assess whether or not a 42% increase for July is unusual: a time series chart and a moving range chart.
The moving range is calculated by taking the absolute difference between two consecutive months. Next, we compute the upper control limit for the moving ranges by taking the mean of the moving ranges and multiplying it by 3.27.
For the time series portion, we compute the upper and lower control limits by taking the mean of the individual values and adding or subtracting 2.66 times the mean moving range. The resulting charts are as follows:
Readers interested in the statistical intuition can refer to one of the references.
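The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `xmr_limits` is hypothetical, and the monthly values are made up for the example, since the figures from Table 2 are not reproduced here. The constants 2.66 and 3.27 are the usual scaling factors for moving ranges of two consecutive values.

```python
from statistics import mean


def xmr_limits(values):
    """Return centre lines and limits for a time series (individuals)
    chart and its companion moving range chart."""
    if len(values) < 2:
        raise ValueError("need at least two observations")

    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]

    x_bar = mean(values)          # centre line of the time series chart
    mr_bar = mean(moving_ranges)  # centre line of the moving range chart

    return {
        "x_bar": x_bar,
        "mr_bar": mr_bar,
        "x_ucl": x_bar + 2.66 * mr_bar,  # upper limit, individual values
        "x_lcl": x_bar - 2.66 * mr_bar,  # lower limit, individual values
        "mr_ucl": 3.27 * mr_bar,         # upper limit, moving ranges
    }


# Hypothetical monthly counts, NOT the data from Table 2.
inventory = [19, 27, 20, 16, 18, 25, 22, 24, 17, 25, 15, 17]

for name, value in xmr_limits(inventory).items():
    print(f"{name}: {value:.2f}")
```

Note that the moving range chart gets only an upper limit, because absolute differences cannot fall below zero.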
Interpretation
First, take a look at the bottom chart, the moving ranges. The upper control limit (UCL) is 15.3, meaning that for a change to be unusual, the difference between one month and the next would have to be greater than 15.3 in either direction. There is no lower control limit (LCL) for moving ranges (recall that we took the absolute difference).
The upper chart, the time series, shows how large an individual value needs to be to be considered unusual. In this case, the UCL is 33.
None of our monthly In-Process Inventories ever came close to the UCL either in the moving ranges or the time series. Therefore, we see no evidence of a significant departure from normal process variation.
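Reading the two charts amounts to checking each point against its limits, which can also be done programmatically. The sketch below is a minimal illustration, not Wheeler's full procedure: it only flags points that fall outside the limits and ignores any additional run rules. The function name `find_signals`, the month labels, and the data are all hypothetical.

```python
def find_signals(values, labels):
    """Flag individual values outside the control limits and moving
    ranges above their upper limit. Only this basic rule is checked."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = sum(values) / len(values)
    mr_bar = sum(moving_ranges) / len(moving_ranges)

    x_ucl = x_bar + 2.66 * mr_bar
    x_lcl = x_bar - 2.66 * mr_bar
    mr_ucl = 3.27 * mr_bar

    signals = []
    for label, value in zip(labels, values):
        if value > x_ucl or value < x_lcl:
            signals.append((label, "individual value", value))
    # Each moving range is attributed to the later of its two months.
    for label, mr in zip(labels[1:], moving_ranges):
        if mr > mr_ucl:
            signals.append((label, "moving range", mr))
    return signals


# Hypothetical series: stable for nine months, then a jump in month 10.
months = [f"m{i}" for i in range(1, 11)]
series = [10, 10, 10, 10, 10, 10, 10, 10, 10, 20]

for label, chart, value in find_signals(series, months):
    print(f"{label}: unusual {chart} ({value})")
```

An empty result corresponds to the situation described above: no evidence of a departure from normal process variation.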
Another Example
Let's now look at On-Time Shipments. Refer back to Table 1. The difference is -0.3%. Intuitively, this is small and nobody would lose sleep over it. Let's take a look at the control chart:
Here, we see a different picture. What appeared to be a mere -0.3% difference hides considerably more variation. The time series chart displays two values below the LCL and one on the line itself. The moving range chart displays one value above the control limit and one very close to it. Something happened in May and August of year one and in May of year three. These periods should be investigated to understand what drove on-time shipments to change.
SPC as the Basis for Business Improvement
If we act on chance variation when neither the SPC approach nor the predicted range gives evidence for action, not only would we be wasting time and resources, but we would also open the system to distortions.
Percent differences, such as those seen on monthly reports and scorecards, are a misleading and inappropriate summary of change. Instead, SPC charts or predicted ranges should be used. Sparklines can easily be adapted to include control limits, and these can substitute for percent change, arrows, or red/yellow/green indicators.
It is not uncommon for business targets to be set. However, when those are set arbitrarily, with no consideration of what the system is capable of, they will "always create a temptation to make the data look favourable. And distortion is always easier than working to improve the system" (Wheeler, 2000).
In 2016 it was discovered that employees at Wells Fargo had created millions of phony accounts. Wells Fargo was fined nearly $200 million. The phony accounts were created in order to meet sales targets.
References
Donald Wheeler, Understanding Variation: The Key to Managing Chaos (2000)
Douglas Montgomery, Introduction to Statistical Quality Control (2012)
Peihua Qiu, Introduction to Statistical Process Control (2013)
About the Author
Thomas Speidel, P.Stat., is a Canadian statistician. He spent ten years working in cancer research before moving to the energy industry. Thomas is often seen writing and commenting on issues of statistical literacy on LinkedIn, Twitter, and several blogs, and is a co-founder of About Data Analysis, a LinkedIn group.
LinkedIn: ca.linkedin.com/in/speidel/en
Twitter: @ThomasSpeidel
About.me: http://about.me/Thomas.Speidel
Go you! SPC is so valuable, yet more or less unknown to so many analytics professionals.
Very interesting and informative article, Thomas. I will steal some of it. Kudos.
Statistics can be Fun by Wendell Abbot, published in 1954, was distributed by many collaborators by the '80s and discussed as an introductory tool to learn why and how to use SPC. One thing is to see a photo taken at X time, another is to see the complete film, where many photos are included. Lack of SPC means that you cannot assure any process is under control and predictable. How can you make a budget, or project any future, without knowing the variability of the different factors that influence the results? SPC + Fishbone + Management Style + Employee Involvement + Variability Analysis are essential tools. Only after applying this and SoPK can you really know the truth.
Imagine no chart at all and single individual problems are reported and acted on arbitrarily with ZERO data. They are acting on only when a person wants to let other people know they are doing something. Of course the acting on is never followed through and the net result is nothing ever changes. Because the response has nothing to do with improving anything.
Shrikant Kalegaonkar, good comments except that your point 3 shows a little confusion. Control limits ARE the natural limits for a process, for any type of Shewhart chart. I suspect that Dr Wheeler's paper here has led to your mix up. https://www.qualitydigest.com/inside/statistics-column/010416-statistical-tolerance-intervals.html In this paper Dr Wheeler attempts to highlight the folly of folk treating Shewhart charts as probability charts. He shows the difference between a confidence interval, tolerance interval and control limits. Not hard to guess what he's talking about with "one computation that modern software offers to unsuspecting users". "Thus, tolerance intervals have a different purpose than the three sigma limits of a process behavior chart. Any attempt to use tolerance intervals as a substitute for three sigma limits for individual values reveals a fundamental lack of understanding of these profound differences between the two techniques."