When NOT to use Control Charts
I have spent a great deal of my time teaching people how and when to use control charts, but not very much on when NOT to use them.
The rules around what the data must look like for a control chart are not well understood; luckily, they are pretty simple:
- Your data do not have to be normally distributed (this is a misconception)
- Your data do have to be approximately symmetrical
- Your data must be random
The last point is the most interesting one, and it is where I am going with this article. I have come across many situations where process improvers use control charts on processes they know are not random, that is, processes that drift or change in predictable ways, such as sales or tool wear.
As with my other articles, I am going to show you how it is done using an example in Minitab.
The control chart shown below is an individuals and moving range (I-MR) chart of viscosity measurements taken hourly from a chemical mixing process. The data are "borrowed" from Introduction to Statistical Quality Control by Montgomery. You can find the Minitab worksheet "Viscosity Measurements.mtw" by following the link.
I am not showing how to complete these steps in Minitab (you could always buy my book!).
Notice the number of points outside the control limits - we could conclude that the process is under poor control. Because the KPOV is viscosity, we might suspect that it drifts slowly rather than changing suddenly. In that case the data are what is known as autocorrelated: each observation depends on the previous one(s), so the data are not truly random.
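The article works through these steps in Minitab; for readers who want to reproduce the individuals chart limits without it, here is a minimal Python sketch. The CSV file name and column name are assumptions - export the Minitab worksheet (or substitute your own hourly viscosity readings).

```python
import numpy as np
import pandas as pd

# Hypothetical export of the "Viscosity Measurements.mtw" worksheet to CSV
viscosity = pd.read_csv("viscosity_measurements.csv")["Viscosity"].to_numpy()

# Individuals (I) chart limits: centre line +/- 2.66 * average moving range
moving_range = np.abs(np.diff(viscosity))
mr_bar = moving_range.mean()
centre = viscosity.mean()
ucl, lcl = centre + 2.66 * mr_bar, centre - 2.66 * mr_bar

# Flag the observations that fall outside the limits
outside = np.where((viscosity > ucl) | (viscosity < lcl))[0]
print(f"UCL = {ucl:.2f}, CL = {centre:.2f}, LCL = {lcl:.2f}")
print("Points outside the limits:", outside)
```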
To evaluate this we will use the autocorrelation function (ACF), which will produce the following chart and output in the session window.
The chart is in many ways easier to understand - note that the observations one hour apart (the Lag) are correlated with a correlation coefficient (the ACF) r1 = 0.82.
Observations 2 hours apart in sequence are correlated with a correlation coefficient r2 = 0.70 (ok, it was 0.6995, so I have rounded it up). Similarly, for 3 and 4 hours apart, r3 = 0.60 and r4 = 0.49. Since each falls outside the +/- 2 sigma limits (red lines), we would say these lag autocorrelations are statistically significant - each observation depends on at least one previous one; in fact, in this case each observation depends on the previous 4.
This is characteristic of naturally drifting processes.
Also, since there is a decaying drop-off in the correlations, we refer to this as an autoregressive process.
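If you are following along outside Minitab, the sample autocorrelations can be computed with statsmodels. This is only a sketch of the idea, not a reproduction of the Minitab output, and it uses the simple +/- 2/sqrt(n) approximation for the significance limits rather than Minitab's lag-dependent bands.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import acf
from statsmodels.graphics.tsaplots import plot_acf

# Hypothetical CSV export, as in the earlier sketch
viscosity = pd.read_csv("viscosity_measurements.csv")["Viscosity"].to_numpy()

# Sample autocorrelations r1, r2, ... for the first few lags
r = acf(viscosity, nlags=8)
for lag, rk in enumerate(r[1:], start=1):
    print(f"lag {lag}: r = {rk:.2f}")

# Rough +/- 2 sigma significance limit for the lag autocorrelations
print(f"approx. significance limit: +/- {2 / np.sqrt(len(viscosity)):.2f}")

# Or simply plot the ACF, which draws the confidence bands for you
plot_acf(viscosity, lags=20)
plt.show()
```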
If we wished, we could draw scatter plots of the correlations by manually lagging the data - I will leave that one as an exercise for the reader, but please do reach out to me for help if you need it.
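As a hint for that exercise, a lag-1 scatter plot could be sketched like this in Python (again assuming the same hypothetical CSV export):

```python
import pandas as pd
import matplotlib.pyplot as plt

viscosity = pd.read_csv("viscosity_measurements.csv")["Viscosity"]

# Plot each observation against the observation one hour earlier (lag 1)
lagged = viscosity.shift(1)
plt.scatter(lagged, viscosity)
plt.xlabel("Viscosity at time t-1")
plt.ylabel("Viscosity at time t")
plt.title(f"Lag-1 scatter, r = {viscosity.corr(lagged):.2f}")
plt.show()
```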
This level of autocorrelation can distort the decisions made from a Shewhart control chart (an I-MR chart, for instance) such that we could conclude that a process is out of control when in fact it is not. This is referred to as a false alarm and can lead to more variation in the process due to over-interference by the operator or engineer.
So, when the process observations are highly autocorrelated, what can we do to still employ SPC, such as an individuals chart, correctly?
The approach I will show here is to model the autocorrelation that we know exists and perform SPC on the residual noise variation once the autocorrelation is removed, so we can detect special-cause spikes, shifts, or trends in the process.
The tool we will use for this is an ARIMA (AutoRegressive Integrated Moving Average) model.
Also make sure that you store the residuals; we are going to use them shortly.
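The article does this through Minitab's ARIMA dialog with the residuals stored; for readers without Minitab, a rough equivalent with statsmodels might look like the following. The model order shown (a simple AR(1)) is an assumption for illustration only - choose the order from the ACF/PACF of your own data.

```python
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Hypothetical CSV export, as in the earlier sketches
viscosity = pd.read_csv("viscosity_measurements.csv")["Viscosity"]

# Fit an autoregressive model; order=(1, 0, 0) is an assumed starting point
model = ARIMA(viscosity, order=(1, 0, 0)).fit()
print(model.summary())   # model coefficients and their significance

# Store the residuals - we will put these on a control chart next
residuals = model.resid
```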
Having completed these steps in Minitab, we get a residuals plot automatically. In this case the residuals show evidence of being independent and normally distributed, suggesting the model is accounting for the variation structure adequately.
We also get the actual model of the data shown in the session window - interesting, and it shows our model as significant, but not really what we need right now.
What we have done up to this point is model the variation - the variation that we know is there. The next step is to produce a control chart of the residuals, what is left after we have stripped out the known variation, to see whether the process is out of control and when that occurred. We will produce an individuals control chart of the residuals in the normal fashion.
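Continuing the Python sketch (using the residuals stored in the ARIMA step above), the individuals chart limits on the residuals can be computed exactly as before:

```python
import numpy as np

# Individuals chart of the ARIMA residuals from the previous sketch
res = np.asarray(residuals)
mr_bar = np.abs(np.diff(res)).mean()
centre = res.mean()
ucl, lcl = centre + 2.66 * mr_bar, centre - 2.66 * mr_bar

outside = np.where((res > ucl) | (res < lcl))[0]
print(f"UCL = {ucl:.2f}, CL = {centre:.2f}, LCL = {lcl:.2f}")
print("Residuals outside the limits:", outside)
```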
Notice that the process no longer shows any evidence of being out of control once the autocorrelated structure in the data, due to the dependency among successive observations, is modelled and removed. Such dependency is often due to the slowness of the process to change relative to the sampling frequency; that is, the sampling interval is much shorter than the time constant of the process dynamics.
In this case we could reduce the sampling rate without compromising control of the process. This may matter little if there is an on-line viscometer or other automated process control where data gathering is cheap and control is automatic, but it could be a major saving if viscosity is read in the lab and over-correction is leading to greater process variation.
Key Learning Points:
- Observations taken in time sequence are often autocorrelated and hence not independently distributed, so the assumptions underlying conventional control charts are violated
- This can increase the number of false out-of-control signals when the process is monitored with an SPC chart, leading to over-correction
- The autocorrelation can be modelled and the control chart applied to the residuals to show out-of-control situations more reliably
- This avoids potentially unnecessary and harmful interventions
About the author
Michael D Akers is the author of "Exploring, Analysing and Interpreting Data with Minitab 18". He is a Lean Six Sigma Master Black Belt with 30 years of practical problem-solving experience in a range of industries; he consults around the world, and his book is available on Amazon and Apple's iBooks platform.
Michael, thanks for this post. I was curious why you chose to use an I chart on the residuals rather than an IMR? Was that just for brevity in the article or deliberate? I’m thinking that MR doesn’t really work here because we are typically doing things like Box-Cox in ARIMA or Multiplicative Errors in Exponential Smoothing in order to stabilize the variance.
I studied autocorrelation and ARIMA in a post degree Math Course a few years ago. I never truly understood the ‘point’ of it and simply learned how to conduct the analysis by ‘rote’ to pass the exam. Now in the space of 20 mins on a long train journey I finally ‘get it’! Thanks Michael, a great article.
Great article, thanks for sharing.
Been preaching about ARIMA for decades. Nice to see