Apr 21, 2017 10:01 AM
Last Modified: Apr 21, 2017 11:43 AM
Process capability analysis assesses how well a process produces product that is within specifications. The standard process capability indices (Cp, Cpk, Cpl, and Cpu) are ratios that relate process performance to specification limits, and they are widely used in industry. Using these indices correctly takes care. A process must be in statistical control for the indices to have any meaning, and drawing conclusions about process conformance from the standard indices requires the assumption that the measurements come from a normal distribution.
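For reference, the standard normal-theory indices can be written as a short Python sketch. The function name and the example numbers are illustrative and do not come from JMP; `mu` and `sigma` stand for whatever estimates of the process mean and standard deviation are being used.

```python
def normal_capability_indices(mu, sigma, lsl, usl):
    """Standard capability indices under the normal assumption.

    mu, sigma: estimated process mean and standard deviation.
    lsl, usl:  lower and upper specification limits.
    """
    cpl = (mu - lsl) / (3 * sigma)   # room between the mean and the lower spec
    cpu = (usl - mu) / (3 * sigma)   # room between the mean and the upper spec
    cp = (usl - lsl) / (6 * sigma)   # spec width relative to process spread
    cpk = min(cpl, cpu)              # the worse of the two one-sided indices
    return {"Cp": cp, "Cpl": cpl, "Cpu": cpu, "Cpk": cpk}


# Hypothetical process: mean 10, standard deviation 1, specs from 4 to 14.
indices = normal_capability_indices(mu=10.0, sigma=1.0, lsl=4.0, usl=14.0)
print(indices)
```

Because the three-sigma and six-sigma spreads in the denominators come from the normal distribution, these ratios only translate into percent nonconforming when the normal assumption holds.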
Nonnormally Distributed Data
Even small departures from normality can greatly affect the error associated with using these indices to estimate percent nonconforming. This means that when we erroneously assume the normal distribution, we risk seriously overestimating or underestimating the percent of nonconforming product. Because nonnormally distributed process measurements frequently occur in industry, several methods have been developed that create analogous process capability indices for nonnormally distributed data. These methods allow us to perform accurate nonnormal process capability analysis if we are careful to choose an appropriate distribution fit for the data.
The JMP 13 Process Capability Analysis platform has been enhanced to allow nonnormal capability analysis. It is now easy to fit a process with a variety of nonnormal distributions and to assess the chosen fit by comparing it with several alternatives. Two methods for computing nonnormal capability indices are also available: the Percentile and Z-Score methods. I will show some of these new features with an example and then highlight the risks of incorrectly assuming the underlying distribution is normal.
A pharmaceutical company produces a drug that has six impurities remaining at the end of manufacturing. The company needs to keep these impurities below 0.5% of the drug. From experience, scientists at the company know that the percent impurities come from a lognormal distribution. Let's examine the capability of their manufacturing process to keep these impurities within specification using the Process Capability platform in JMP 13.
Figure 1 shows how to set up the Process Capability launch dialog to do nonnormal capability analysis. I assigned columns Impurity 1 through Impurity 6 as processes and then set the process distribution to Lognormal for all processes. I kept the default nonnormal distribution option, which calculates the nonnormal capability indices with the Percentile method. (Note: I saved the upper spec limits as column properties in the data table so that I did not have to enter the spec limits through a dialog.)
Figure 1: Process Capability Launch Dialog
Visualizing Process Capability
Figure 2 shows the initial process capability report after the platform is launched. The Capability Index Plot is a quick way to visualize how well the processes are meeting specifications. The processes with markers above the Ppk line surpass the company’s Ppk threshold of 1.33 needed to be considered capable. (Notice that the horizontal axis will show the type of distributional fit in parentheses after each process name unless the fit is normal.)
This plot shows that impurities 1, 5, and 6 surpass the Ppk=1.33 threshold and appear to be capable. The processes for removing impurities 2, 3, and 4 do not appear to be capable. However, it is important to make sure that the lognormal fits are reasonable before we draw conclusions based on these indices. To examine these processes and the distributional fits in more detail, let's turn on the Individual Detail Reports from the menu.
Figure 2: Capability Index Plot
Figure 3 shows the Individual Detail Report for Impurity 1. The report shows a histogram of the data with the lognormal fit density curve, a summary of the process information, nonnormal capability indices, parameter estimates for the fitted lognormal distribution, and the nonconformance report. To assess how well the lognormal distribution fits the data, I choose Compare Distributions from the Impurity 1(Lognormal) Capability menu.
Figure 3: Individual Detail Report for Impurity 1
Figure 4 shows the initial Compare Distributions report inside the Individual Detail Report for Impurity 1 with the chosen lognormal density curve fit on the histogram along with fit statistics for the lognormal fit in the Comparison Details table. To compare the lognormal fit of Impurity 1 to the other parametric distribution fits, I check the boxes for Normal, Gamma, Johnson, and Weibull.
Figure 4: Compare Distributions Report for Impurity 1
Figure 5 shows the Compare Distributions report with the other parametric distributions checked. For each checked distribution, the corresponding density curve and fit statistics appear in the histogram and Comparison Details table. The Comparison Details table sorts the distribution fits by the AICc criterion by default.
It appears that the lognormal fit is the best fit according to all three criteria shown. The density curves on the histogram offer a nice visual assessment of the fit for each checked distribution. But to gain even more insight from the distributional fits, we can look at the Probability Plots. Since the gamma distribution fit appears to be a close second to the lognormal fit, I uncheck all of the distributions except gamma and lognormal.
Figure 5: Compare Distributions Report for Impurity 1 with all parametric distributions checked
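The AICc ranking in the Comparison Details table can be illustrated outside JMP with a small Python sketch. To keep it dependency-free, it compares only the normal and lognormal fits, both of which have closed-form maximum likelihood estimates. The data are simulated and the function names are my own, so this shows the idea and does not reproduce the report.

```python
import random
from math import log, pi


def normal_loglik(xs):
    """Maximized normal log-likelihood (uses the MLE variance, divisor n)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    return -0.5 * n * (log(2 * pi * var) + 1)


def lognormal_loglik(xs):
    """Maximized lognormal log-likelihood: normal fit to the logs plus the Jacobian term."""
    logs = [log(x) for x in xs]
    return normal_loglik(logs) - sum(logs)


def aicc(loglik, k, n):
    """Corrected AIC for a model with k estimated parameters and n observations."""
    return -2 * loglik + 2 * k + 2 * k * (k + 1) / (n - k - 1)


# Simulated stand-in for one impurity column (hypothetical parameters).
random.seed(0)
data = [random.lognormvariate(log(0.2), 0.5) for _ in range(300)]

fits = {
    "Normal": aicc(normal_loglik(data), k=2, n=len(data)),
    "Lognormal": aicc(lognormal_loglik(data), k=2, n=len(data)),
}
best = min(fits, key=fits.get)  # smaller AICc indicates the preferred fit
print(fits, best)
```

Smaller AICc is better. Here both candidates estimate two parameters, so the penalty terms are equal and the ranking comes down to the likelihoods.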
Figure 6 shows the Compare Distributions report with only the gamma and lognormal distributions checked. I then choose Probability Plots from the Compare Distributions menu.
Figure 6: Compare Distributions Report for Impurity 1 with gamma and lognormal distributions checked
Figure 7 shows the Probability Plots for Impurity 1 for the fitted gamma and lognormal distributions. A good fit is indicated by the points following the diagonal line. The lognormal distribution seems to be a reasonable fit to the data and looks better than the gamma distribution fit. I am happy with the choice of the lognormal distribution fit for this process. However, if the probability plots had looked different and gamma had been the better fit, I could have changed the Selected radio button to Gamma. Changing the selected distribution of a process updates all the capability reports and graphs with results that use the selected distribution.
Figure 7: Probability Plots for gamma and lognormal distribution fits
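The idea behind a probability plot can be sketched in a few lines of Python: sort the data, pair each observation with the quantile the fitted distribution predicts at that rank, and check how closely the pairs track the diagonal. The plotting position `(i - 0.5) / n`, the simulated data, and the function names are assumptions made for illustration. JMP's plot may use different conventions.

```python
import random
from math import exp, log, sqrt
from statistics import NormalDist, mean, pstdev


def lognormal_probability_points(data):
    """Return (fitted quantile, observed value) pairs for a lognormal fit."""
    observed = sorted(data)
    n = len(observed)
    logs = [log(x) for x in observed]
    mu, sigma = mean(logs), pstdev(logs)  # maximum likelihood estimates on the log scale
    std_normal = NormalDist()
    points = []
    for i, x in enumerate(observed, start=1):
        p = (i - 0.5) / n  # plotting position; other conventions exist
        points.append((exp(mu + sigma * std_normal.inv_cdf(p)), x))
    return points


def pearson_r(points):
    """Correlation between fitted quantiles and observations (1.0 is a perfect line)."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Simulated stand-in for an impurity column (hypothetical parameters).
random.seed(1)
sample = [random.lognormvariate(log(0.2), 0.3) for _ in range(200)]
points = lognormal_probability_points(sample)
r = pearson_r(points)
print(r)
```

A correlation near 1 corresponds to points hugging the diagonal. The plot itself is still worth inspecting, because systematic curvature in the tails can hide behind a high correlation.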
After repeating this analysis and assessing the lognormal distribution fits in the Individual Detail Reports for all of the impurities, we can confirm that the lognormal distribution was a reasonable fit for all of the processes. Let's choose Summary Reports from the main Process Capability menu to get an overview of the capability indices and nonconformance. Figure 8 shows the Overall Sigma Capability Summary Report table.
Examining the Expected % Above USL, or equivalently the Expected PPM (parts per million) above USL, we notice the high expected amounts of nonconforming product with regard to impurities 2, 3, and 4. We also once again note the low Ppk values for these impurities. This table, along with the Capability Index Plot that we looked at initially (Figure 2), helps us see that the pharmaceutical company needs to improve the processes for removing impurities 2, 3, and 4 in its drug.
Figure 8: Summary Report with lognormal distribution fits
If We Had Assumed Normality...
To highlight the importance of using nonnormal capability analysis with an appropriate distribution when normality cannot be assumed, we can run this process capability analysis using the normal distribution fit for all the impurities. Figure 9 shows the Overall Sigma Capability Summary Report that assumes the normal distribution for all of the processes.
Figure 9: Summary Report with normal distribution fits
Notice that all of the Ppk indices computed under the normal assumption are higher than those computed under the lognormal assumption, and impurity 3 (assuming normality) exceeds the Ppk=1.33 threshold and would be considered capable. Notice also that under the normal distribution we expect far less nonconforming product than under the lognormal distribution.
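The same pattern can be reproduced with a small Python sketch. It takes a hypothetical lognormal process (the parameters below are made up, not taken from the impurity data) and compares what we would conclude from a normal distribution with the same mean and standard deviation against what the lognormal distribution itself implies. The lognormal Ppk uses the quantile-ratio form of the Percentile method, which is my assumption about the general approach and not a copy of JMP's output.

```python
from math import exp, log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

# Hypothetical lognormal process (log-scale parameters) and upper spec limit.
MU_LOG, SIGMA_LOG, USL = log(0.2), 0.3, 0.5

# Mean and standard deviation of that lognormal on the original scale.
mean = exp(MU_LOG + SIGMA_LOG ** 2 / 2)
sd = mean * sqrt(exp(SIGMA_LOG ** 2) - 1)

# Normal assumption: same mean and SD, but a symmetric, lighter right tail.
ppk_normal = (USL - mean) / (3 * sd)
ppm_normal = 1e6 * (1 - NormalDist(mean, sd).cdf(USL))

# Lognormal assumption: quantile-based Ppk and the lognormal tail area.
median = exp(MU_LOG)
p99865 = exp(MU_LOG + SIGMA_LOG * STD_NORMAL.inv_cdf(0.99865))
ppk_lognormal = (USL - median) / (p99865 - median)
ppm_lognormal = 1e6 * (1 - NormalDist(MU_LOG, SIGMA_LOG).cdf(log(USL)))

print(f"Normal:    Ppk = {ppk_normal:.2f}, expected PPM above USL = {ppm_normal:.1f}")
print(f"Lognormal: Ppk = {ppk_lognormal:.2f}, expected PPM above USL = {ppm_lognormal:.1f}")
```

With these made-up numbers the normal assumption puts Ppk above 1.33 while the lognormal assumption puts it below, and the expected PPM above the USL differs by more than two orders of magnitude. The right tail of a lognormal distribution is heavier than a normal curve with matching mean and standard deviation can represent.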
Incorrectly assuming normality could put the pharmaceutical company at risk of having a lot more nonconforming product than its scientists were expecting. Thankfully, the ability to choose an appropriate distribution fit and perform nonnormal process capability analysis in the Process Capability platform helps the company make a more accurate assessment of its process capability.