The Basic 7 #6 : Part 2 : Control Charts
Hello Readers,
Continuing with Number 6 in the Basic 7: the Control Chart!
(Please read my previous posts to catch up with this ongoing series.)
In this article, I will revisit the detailed preparation of the basic types of control charts.
X-Bar Charts
First, we will consider variables, which are continuously scaled. In theory, a variable can take on any value within a given range, although they are often measured at specific intervals. The average weekly ratings at the hotel are variables, since they can take on any value between 1 and 10.
There are two things we are concerned about with our sample measurements. One is that the mean of the measurements is statistically in control. To be in control means that the process contains only random variation, not assignable variation. One way to analyze the control of a process is with a control chart. To test for statistical control of the variable mean we would use an x-bar control chart.
The term x-bar, denoted $\bar{X}$, represents the mean of a sample of measurements. Each measurement in a sample is an X value, i.e. $X_1$, $X_2$, $X_3$, etc. If the sample size is n, then n of the X value measurements are averaged to calculate an x-bar value. (The bar denotes “mean of values.”)
An x-bar chart is a line graph with each plotted point being the x-bar value for a sample of measurements. We also plot a line on the chart which represents the mean of our x-bar values. The mean of x-bar values is, as you might imagine, called x-bar-bar, denoted $\bar{\bar{X}}$. X-bar-bar is the mean of a group of x-bar values.
Note that the sample size is five. The sample size is NOT seven, which is the number of samples.
The x-bar values are calculated as the mean of the measurements in each sample. The column labeled R contains the sample range values. The range of a sample is simply the maximum measurement in the sample minus the minimum measurement in the sample. These R values will be used later.
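The calculations just described can be sketched in a few lines of Python. The measurements below are hypothetical (three samples of five measurements each, made up for illustration) and are not the data from the example in this article.

```python
from statistics import mean

# Hypothetical measurements: 3 samples, each with sample size n = 5.
samples = [
    [44.0, 46.0, 45.0, 47.0, 43.0],
    [43.0, 45.0, 48.0, 44.0, 45.0],
    [46.0, 47.0, 45.0, 46.0, 46.0],
]

# x-bar: the mean of the measurements in each sample.
x_bars = [mean(s) for s in samples]

# R: the maximum measurement minus the minimum measurement in each sample.
ranges = [max(s) - min(s) for s in samples]

# x-bar-bar: the mean of the x-bar values (center line of the x-bar chart).
x_bar_bar = mean(x_bars)

# R-bar: the mean of the R values (used later for control limits).
r_bar = mean(ranges)

print("x-bar values:", x_bars)
print("R values:", ranges)
print("x-bar-bar:", round(x_bar_bar, 2), "R-bar:", round(r_bar, 2))
```

Note that the sample size here is the length of each inner list (five), not the number of inner lists.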
We can calculate x-bar-bar to be 45.63 as the average of the x-bar values. Of course, we would expect some of the x-bar values to be above the x-bar-bar value, and some to be below it. If there is just natural variation, then we would expect a random distribution above and below the mean. If we plot the data we see the following:
We observe that the first three points are above the mean line, but that otherwise, the data seems to be randomly dispersed above and below the mean. The movement around the mean line in this example appears to be natural variation.
Another thing it would be good to know about the x-bar values is how dispersed they are. In particular, we would like to know if a particular x-bar value is outside of a reasonable range. The x-bar values are going to vary somewhat simply due to natural variation. Unusual variation may indicate that there is a special cause, or a specific reason, for the variation. That special cause may be a problem that needs to be addressed.
Statistically, we can calculate a range of reasonable variation in the x-bar chart. That reasonable range is bounded by control limits. The upper control limit (UCL) indicates the maximum value that is statistically reasonable, and the lower control limit (LCL) indicates the minimum reasonable value. By “reasonable” we mean being likely to occur given natural variation. This is different from “acceptable” in terms of acceptable quality. It is up to management, employees, and customers to decide what is acceptable in terms of quality measurements. SPC control charts instead just look at statistical variation, to help identify whether the variation follows a usual probability distribution.
The upper and lower control limits are calculated from some equations based on the central limit theorem, which was mentioned previously. We need to know the standard deviation of the x-bar values, which we can calculate over time. It might be easier to determine the standard deviation of individual measurements, which allows us to calculate the standard deviation of sample means (x-bar values) according to the following equation:
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
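where σ is the standard deviation of individual measurements and n is the sample size. The control limits are calculated by the following equations:
$$\mathrm{UCL} = \bar{\bar{X}} + z\,\sigma_{\bar{X}}, \qquad \mathrm{LCL} = \bar{\bar{X}} - z\,\sigma_{\bar{X}}$$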
where the z value is the number of standard deviations (sigmas) from the mean at which the control limits are placed. A “three-sigma” control chart uses z=3, which considers measurements within three standard deviations of the mean to be “natural variation.” For the normal distribution, 99.7 percent of the distribution occurs within three standard deviations of the mean, implying that values will be outside that range only 0.3 percent of the time. In other words, it is quite unlikely that natural variation will cause recurring values outside of three-sigma control limits.
A more sensitive control chart has z=2, which is called a “two-sigma” control chart. Only 95.4 percent of the normal distribution occurs within two standard deviations of the mean, implying that 4.6 percent of the time the values will be outside of that range. A two-sigma control chart is therefore more likely than a three-sigma control chart to detect when a process changes, but it is also more likely to give a false alarm.
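The coverage percentages quoted above can be checked with a short Python sketch. It assumes an exactly normal distribution and uses the standard identity that the fraction within z standard deviations of the mean is erf(z/√2); the function name `coverage` is just an illustrative choice.

```python
import math


def coverage(z):
    """Fraction of a normal distribution within z standard deviations of the mean."""
    return math.erf(z / math.sqrt(2))


for z in (2, 3):
    inside = coverage(z)
    print(f"z={z}: {inside:.1%} inside the limits, {1 - inside:.1%} outside")
```

The share outside the limits is the false-alarm rate per plotted point when the process has not changed, which is why tighter limits trade sensitivity for more false alarms.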
For our example above, imagine that we determine that the standard deviation of measurements is 2.3. With a sample size of n=5, we calculate the standard deviation of x-bar values as 2.3/√5=1.029.
For a three-sigma control chart, the LCL is 45.63-3x1.029=42.54 and the UCL is 45.63+3x1.029=48.72. These control limits can be plotted on a control chart as follows:
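As a rough check on this arithmetic, here is a minimal Python sketch of the same calculation, using the values quoted in the text (x-bar-bar of 45.63, a measurement standard deviation of 2.3, and a sample size of 5). The function name is hypothetical.

```python
import math


def xbar_limits_from_sigma(x_bar_bar, sigma, n, z=3.0):
    """Return (LCL, UCL) for an x-bar chart when sigma of individual measurements is known."""
    # Standard deviation of the sample means: sigma divided by sqrt(n).
    sigma_x_bar = sigma / math.sqrt(n)
    return x_bar_bar - z * sigma_x_bar, x_bar_bar + z * sigma_x_bar


lcl, ucl = xbar_limits_from_sigma(x_bar_bar=45.63, sigma=2.3, n=5, z=3.0)
print(f"LCL={lcl:.2f}, UCL={ucl:.2f}")
```

Passing z=2.0 instead gives the narrower two-sigma limits discussed earlier.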
In this data we conclude that something peculiar might have happened in period 4, since the x-bar value is below the lower control limit. Keep in mind that it is possible that the x-bar value for period 4 is a random occurrence. We need to be careful about jumping to the conclusion that something is definitely wrong with the process. However, such an x-bar value is not very likely to occur without a special cause, since only 0.3 percent of truly random values will be outside of the three-sigma control limits. Therefore, it would be good to investigate the situation further.
If we do not have the standard deviation of measurements or sample means, we can use the R (range) values to estimate the standard deviation by using tables that contain control limit factors. The following is a three-sigma factor table.
The A-factor from the table pertains to x-bar charts. The other two factors will be used in a different chart. (Some textbooks call these factors by different names, such as A2, D3, and D4.) The sample size is the number of measurements in each sample. Be careful not to confuse the sample size with the number of samples. The sample size is not the number of samples in the chart! It is the number of measurements within each sample.
Although you can construct a control chart with a sample size of 2, it is probably not the best choice. The bigger the sample size, the more statistically representative your control chart values will be. A sample size of 5 or 6 is probably okay. Larger sample sizes are better, but they often take more effort to gather.
The upper control limit (UCL) for an x-bar chart is calculated as:
$$\mathrm{UCL} = \bar{\bar{X}} + A\,\bar{R}$$
where r-bar is calculated as the average of the sample range values, as discussed previously. The lower control limit (LCL) for an x-bar chart is similarly calculated as:
$$\mathrm{LCL} = \bar{\bar{X}} - A\,\bar{R}$$
Since the factors come from a three-sigma factor table, the LCL and UCL sit approximately three standard deviations of the x-bar values below and above x-bar-bar. The central limit theorem shows that the x-bar values tend to follow a normal distribution. Three standard deviations either side of the mean of a normal distribution include 99.7 percent of the probability. This means that 99.7 percent of the time, a random number drawn from a normal distribution will be within three standard deviations of the mean. It is quite unlikely that values would fall outside of that range on a regular basis.
From the data above, we can calculate control limits for the example x-bar chart. We have n=5, X-bar-bar=45.63, R-bar=5.43, and A=0.577. Therefore we have LCL=45.63–0.577×5.43=42.50 and UCL= 45.63+0.577×5.43=48.76 . Note that these control limits are almost the same as were calculated above using the standard deviation of measurements.
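The same limits can be reproduced with a small Python sketch. The inputs are the values quoted above (x-bar-bar of 45.63, R-bar of 5.43, and an A-factor of 0.577 for a sample size of 5); the function name is a hypothetical one chosen for illustration.

```python
def xbar_limits_from_range(x_bar_bar, r_bar, a_factor):
    """Return (LCL, UCL) for an x-bar chart using the range-based A-factor."""
    half_width = a_factor * r_bar
    return x_bar_bar - half_width, x_bar_bar + half_width


# Values taken from the worked example in the text (n = 5).
lcl, ucl = xbar_limits_from_range(x_bar_bar=45.63, r_bar=5.43, a_factor=0.577)
print(f"LCL={lcl:.2f}, UCL={ucl:.2f}")
```

The A-factor must be looked up for the actual sample size in use; 0.577 applies only when each sample holds five measurements.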
R-Charts
An x-bar chart tells us if the central tendency (i.e. mean) of the samples appears to be in control over time. Another chart, the R-chart, tells us if the variance within each sample tends to be in control over time. It is certainly possible for the sample means to be in control over time, yet the sample variance is getting worse and worse.
An R-chart is constructed in the same way as an x-bar chart, except for the following:
- the points which are plotted are the R, or range, values calculated above.
- the central line is the R-bar value, which is the mean of the R values.
- the upper control limit, UCL, is simply the B-factor times R-bar.
- the lower control limit, LCL, is simply the C-factor times R-bar.
Again, since we are using three-sigma factors, we would expect the R values to fall within the control limits 99.7 percent of the time.
For the data from the example above we calculate UCL =2.114×5.43=11.48, and LCL =0×5.43=0. Note that lower control limits for R-charts are bounded by zero, since it is impossible to have R values less than zero.
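A minimal Python sketch of this R-chart calculation follows, using the values from the example (R-bar of 5.43, an upper factor of 2.114 and a lower factor of 0 for a sample size of 5). The function and parameter names are hypothetical.

```python
def r_chart_limits(r_bar, upper_factor, lower_factor):
    """Return (LCL, UCL) for an R-chart; the center line is r_bar itself."""
    lcl = lower_factor * r_bar
    ucl = upper_factor * r_bar
    return lcl, ucl


# Factors for a sample size of 5, as used in the worked example.
lcl, ucl = r_chart_limits(r_bar=5.43, upper_factor=2.114, lower_factor=0.0)
print(f"LCL={lcl:.2f}, center={5.43:.2f}, UCL={ucl:.2f}")
```

Because the lower factor is never negative, the LCL computed this way cannot fall below zero.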
The control chart is as follows:
We see that in period 7 the range of values appears to have gone up beyond what is statistically common. It is interesting to note that the x-bar value for that sample was fine, meaning that on average the measurements in the sample were in control. However, we had more variance in that sample than we would usually have expected, so we should investigate for special causes or changes in the process.
Do we care if an R value is at or below the LCL? (Assuming we have a non-zero LCL) It seems that less variance would be better. In fact it might be good to investigate improvements in the process so that they can be assured to continue in the future.
P-Charts & C-Charts
Next we will look at control charts for attributes, which are not continuous variables but are things that can be counted. A p-chart considers the portion of a sample that is defective, where each item in the sample is either defective or not. For example, an airline might track on-time arrivals of flights on each given day. If there are 50 flights being tracked each day (the sample size), we could determine a p value as the portion of flights that arrive late. On a given day if 10 flights arrive late then the p value is 10/50=0.2.
The airline might plot the p value over time on a p-chart. The center line of the p-chart is the average of a series of p values, which is p-bar. The control limits will be above and below the center line by the appropriate number of standard deviations (z). A standard deviation for a p-chart is calculated according to the following equation:
$$\sigma_{\bar{p}} = \sqrt{\frac{\bar{p}\,(1-\bar{p})}{n}}$$
where p-bar is the average of a series of p values and n is the sample size (the number of items in each sample). The control limits are simply:
$$\mathrm{UCL} = \bar{p} + z\,\sigma_{\bar{p}}, \qquad \mathrm{LCL} = \bar{p} - z\,\sigma_{\bar{p}}$$
However, note that a p value can never be negative, implying that the LCL should never be negative. If the LCL calculates to be negative then use LCL=0.
For example, the airline might have the following late arrival data for the past seven days:
Using the equation above we determine that sigma-p-bar=0.0556. For a three-sigma control chart (z=3) we calculate control limits of LCL=0.025 and UCL=0.358.
It looks like the p values are well within the control limits, with no unusual patterns. We might conclude that the late arrival process, while not ideal, is in statistical control.
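The p-chart calculation can be sketched in Python as follows. The late-arrival counts below are hypothetical numbers made up for illustration (they are not the airline data from the example above), and the function name is likewise an assumption.

```python
import math


def p_chart_limits(p_bar, n, z):
    """Return (LCL, UCL, sigma) for a p-chart with sample size n."""
    sigma_p = math.sqrt(p_bar * (1 - p_bar) / n)
    # A proportion can never be negative, so the LCL is floored at zero.
    lcl = max(0.0, p_bar - z * sigma_p)
    ucl = p_bar + z * sigma_p
    return lcl, ucl, sigma_p


# Hypothetical data: late flights out of n = 50 tracked flights on each of 7 days.
n = 50
late_counts = [10, 8, 12, 9, 11, 7, 13]
p_values = [count / n for count in late_counts]
p_bar = sum(p_values) / len(p_values)

lcl, ucl, sigma_p = p_chart_limits(p_bar, n, z=3.0)
print(f"p-bar={p_bar:.3f}, sigma={sigma_p:.4f}, LCL={lcl:.3f}, UCL={ucl:.3f}")

# Days whose p value falls outside the limits (none for this made-up data).
flagged = [day for day, p in enumerate(p_values, start=1) if p < lcl or p > ucl]
print("Days outside limits:", flagged)
```

With a small p-bar or a small sample size the raw lower limit goes negative, which is where the floor at zero takes effect.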
Finally, a c-chart considers attributes that can be counted, without a specific sample size. Instead we have a sample frame which defines a range of defects to be counted. For example, the airline might count the number of complaint letters that come in day to day. There is no sample size since any number of letters could arrive (and an individual might even send multiple letters).
The plot values for a c-chart are the c values, which are the counts for each sample frame. The center line is c-bar, the average of a series of c values. We often assume that the count values follow a Poisson distribution, which has the standard deviation as follows:
$$\sigma = \sqrt{\bar{c}}$$
For example, the airline might count the number of complaint letters that arrive over the past seven days as follows:
Using the equation above, σ=sqrt(10)=3.162. For a three-sigma control chart (z=3) we calculate LCL=0.513 and UCL=19.487. As with p-charts, the c values will never be negative, so if the LCL computes negative then use zero. Here is a c-chart for this data:
This data is in statistical control. Note, however, that if we used two-sigma control limits (z=2) then points would be outside of the control limits, suggesting the counts of complaints is not in statistical control.
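To see how much the choice of z matters here, the following Python sketch computes c-chart limits for a c-bar of 10, the value implied by σ=sqrt(10) above. The function name is hypothetical, and the Poisson assumption is the one stated in the text.

```python
import math


def c_chart_limits(c_bar, z):
    """Return (LCL, UCL) for a c-chart, assuming Poisson-distributed counts."""
    sigma = math.sqrt(c_bar)
    # Counts can never be negative, so the LCL is floored at zero.
    return max(0.0, c_bar - z * sigma), c_bar + z * sigma


for z in (3.0, 2.0):
    lcl, ucl = c_chart_limits(c_bar=10.0, z=z)
    print(f"z={z:.0f}: LCL={lcl:.3f}, UCL={ucl:.3f}")
```

The two-sigma limits are noticeably tighter than the three-sigma ones, so daily counts that sit comfortably inside the wider band can land outside the narrower one.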
|| Summary of Things to Look For ||
What are we looking for in a control chart that might cause us to suspect the process has changed? The following are some examples:
- An unusual tendency for the sample values to be above or below the x-bar-bar or R-bar line. “Unusual” might mean five or more in a row.
- An unusual tendency for the sample values to be near a control limit. Here, “unusual” might mean two or more values that are very near the control limit.
- Values that appear outside of the control limits. If three-sigma control limits are used, then it is quite unusual for even one value to appear outside of the control limits.
- Other peculiar patterns in the data, such as erratic fluctuations above and below the mean, or patterns that repeat over a fixed number of samples.
Again, with each of these occurrences we investigate for special causes. Such patterns of behavior can happen with mere natural variation, but they are not likely.
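Two of the checks listed above (values outside the control limits, and five or more values in a row on one side of the center line) are easy to automate. The Python sketch below is one possible way to do it; the function names and the x-bar values are hypothetical, and the threshold of five follows the rule of thumb given above rather than any formal standard.

```python
def points_outside_limits(values, lcl, ucl):
    """Return the indices of values below the LCL or above the UCL."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


def runs_on_one_side(values, center, min_length=5):
    """Return (start_index, length) for each run of at least min_length
    consecutive values strictly above, or strictly below, the center line."""
    runs = []
    start, side = 0, 0  # side: +1 above center, -1 below, 0 on the line
    # A sentinel equal to the center closes any run still open at the end.
    for i, v in enumerate(list(values) + [center]):
        current = (v > center) - (v < center)
        if current != side:
            if side != 0 and i - start >= min_length:
                runs.append((start, i - start))
            start, side = i, current
    return runs


# Hypothetical x-bar values, checked against illustrative limits.
x_bars = [46.1, 46.4, 45.9, 46.8, 46.2, 44.9, 45.2, 42.1, 46.0]
print("Outside limits at indices:", points_outside_limits(x_bars, 42.54, 48.72))
print("Runs on one side (start, length):", runs_on_one_side(x_bars, 45.63))
```

A flagged point or run is only a prompt to look for a special cause; as noted above, such patterns can still arise from natural variation.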
|| Updating Control Limits ||
Once you have enough data available, you can construct a control chart. It is probably a good idea to have at least five to ten samples of data before you do. Then, as you collect more samples of data, they can be appended to the control chart. As long as the process appears to be in control, it is usually not necessary to recalculate the control limits and central line with each new sample; just use the ones you originally calculated. However, if it is determined that the process has changed, then it would be good to gather measurements from the new process as the basis for recalculating the control limits.
Thanks for reading & watch out for my next article on Quality Tools for Process Improvements
# 7 : Check Sheet