Statistical Process Control

A few months ago, I posted an article on comparing KPIs. I illustrated a methodology that compares an observed KPI to a predicted range: if the KPI falls outside the predicted confidence range, we should take a closer look for signs of problems.

Because the predicted range is estimated from a multivariable model, this approach assumes we have some underlying understanding of the system we are studying. What if we don't?


Statistical Process Control

Statistical Process Control (SPC) is a collection of inferential methods used to monitor and identify sources of variation in a process. The objective is usually that of making an informed decision on whether or not to continue a production process. There are two focus areas in SPC: process control and acceptance sampling. In this article I will focus on process control. SPC was pioneered by Shewhart and later popularized by Deming.

Walter Shewhart (left) and W. Edwards Deming (right).

The aim of process control is to understand whether deviations in a process are due to chance variation or to a yet-to-be-identified assignable cause.

When the state of knowledge around a system or a process is limited, SPC's univariate methods can help us formulate a question and focus the investigation.

Example

Donald Wheeler, in his book Understanding Variation, presents numerous real-world examples. Here, I illustrate one. The table below (Wheeler, 2000) represents a common management report:

Table 1

The percent difference column (fifth from the left) compares the current month to the monthly average. One number that stands out is the +42% increase in In-Process Inventory. A legitimate question to ask ourselves is: is this unusual?
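As a quick illustration of how such a percent difference is computed (the two numbers below are made up for illustration, not the actual values from Table 1):

```python
# Percent difference of the current month versus the monthly average.
# These figures are illustrative, not the In-Process Inventory data.
monthly_average = 19.2
current_month = 27.3

percent_difference = 100 * (current_month - monthly_average) / monthly_average
print(f"{percent_difference:+.0f}%")  # prints "+42%"
```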


Data

To find out, we will need the raw data for In-Process Inventory. There are two things to notice. First, in this particular case we are dealing with cumulative data (counts); therefore, sampling is not a concern (imagine, on the other hand, if we were dealing with the weights of chocolate bars). Second, there should be no fundamental changes to the process within the selected time window.

Table 2

Control Charts

Two charts, used in conjunction, can help assess whether a value of +42% for July is unusual: a time-series chart of the individual values and a moving range chart.

The moving range is calculated by taking the absolute difference between consecutive months. We then compute the upper control limit of the moving range chart by taking the mean of the moving ranges and multiplying it by 3.27.

For the time-series portion, we compute the upper and lower control limits by taking the mean of the individual values and adding or subtracting 2.66 times the mean moving range. The resulting charts are as follows:
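The calculations above can be sketched in a few lines of Python. The data values here are illustrative, not Wheeler's actual In-Process Inventory figures:

```python
# Illustrative monthly values (not the data from Table 2).
values = [20, 24, 18, 22]

# Moving ranges: absolute difference between consecutive months.
moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
mr_bar = sum(moving_ranges) / len(moving_ranges)
x_bar = sum(values) / len(values)

# Moving range chart: UCL = 3.27 * mean moving range (no LCL).
mr_ucl = 3.27 * mr_bar

# Individuals (time-series) chart: mean +/- 2.66 * mean moving range.
x_ucl = x_bar + 2.66 * mr_bar
x_lcl = x_bar - 2.66 * mr_bar
```

The constants 3.27 and 2.66 are the standard XmR chart scaling factors for subgroups of size two.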

Readers interested in the statistical intuition can refer to the references.

Interpretation

First, take a look at the bottom chart, the moving ranges. The upper control limit (UCL) is 15.3, meaning that for a value to be unusual, the difference between one month and the next would have to be greater than 15.3 in either direction. There is no lower control limit (LCL) on a moving range chart (recall we took the absolute difference).

The upper chart, the time series, shows how large an individual value needs to be before it is considered unusual. In this case, the UCL is 33.

None of our monthly In-Process Inventories ever came close to the UCL either in the moving ranges or the time series. Therefore, we see no evidence of a significant departure from normal process variation.
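This check can be wrapped into a small helper that flags individual values falling outside the natural process limits. The function name and data are my own for illustration, not Wheeler's:

```python
def xmr_signals(values):
    """Return indices of values outside the XmR natural process limits."""
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    x_bar = sum(values) / len(values)
    ucl = x_bar + 2.66 * mr_bar  # upper control limit
    lcl = x_bar - 2.66 * mr_bar  # lower control limit
    return [i for i, v in enumerate(values) if v > ucl or v < lcl]

# A stable series produces no signals; an extreme jump is flagged.
print(xmr_signals([20, 24, 18, 22]))              # prints "[]"
print(xmr_signals([20, 24, 18, 22, 21, 19, 23, 60]))  # prints "[7]"
```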


Another Example

Let's now look at On-Time Shipments. Refer back to Table 1. The percent difference is -0.3%. Intuitively, this is small, and nobody would lose sleep over it. Let's take a look at the control charts:

Here we see a different picture. What appeared to be a mere -0.3% conceals significantly more variation. The time-series chart displays two values below the LCL and one on the line itself. The moving range chart displays one value above the control limit and one very close to it. Something happened in May of year three, and in May and August of year one. These periods should be investigated to understand what drove on-time shipments to change.


SPC as the Basis for Business Improvement

If we act on chance variation, with no evidence for action from either the SPC approach or the predicted range, not only would we waste time and resources, but we would also open the system to distortions.

Percent differences, such as those seen on monthly reports and scorecards, are a misleading and inappropriate summary of change. Instead, SPC charts or predicted ranges should be used. Sparklines can easily be adapted to include control limits, and these can replace percent changes, arrows, or red/yellow/green indicators.

It is not uncommon for business targets to be set. However, when those targets are set arbitrarily, with no consideration of what the system is capable of, they will "always create a temptation to make the data look favourable. And distortion is always easier than working to improve the system" (Wheeler, 2000).

In 2016 it was discovered that employees at Wells Fargo had created millions of phony accounts in order to meet sales targets. Wells Fargo was fined nearly $200 million.


References

Donald Wheeler, Understanding Variation: The Key to Managing Chaos, SPC Press (2000)

Douglas Montgomery, Introduction to Statistical Quality Control, Wiley (2012)

Peihua Qiu, Introduction to Statistical Process Control, Chapman & Hall/CRC (2013)


About the Author

Thomas Speidel, P.Stat., is a Canadian statistician. He spent ten years working in cancer research before moving to the energy industry. Thomas is often seen writing and commenting on issues of statistical literacy on LinkedIn, Twitter, and several blogs, and is a co-founder of About Data Analysis, a LinkedIn group.

LinkedIn: ca.linkedin.com/in/speidel/en 

Twitter: @ThomasSpeidel

About.me: http://about.me/Thomas.Speidel




