
Non-normal data Capability Analysis

ss2980

Occasional Contributor

Joined:

Jul 11, 2018

Hi everyone,

 

I am trying to calculate Cpk for two different parameters. Neither data set is normally distributed. I followed the tips provided in the following post:

https://community.jmp.com/t5/JMPer-Cable/Process-Capability-Analysis-for-nonnormal-data/ba-p/38112

cdist_att1.JPG

Based on this analysis, the mixture of 3 normals looks the best (by AICc). I then plotted the individual probability plots. Shown below are the top 3 probability plots, including "Normal".

prob_plots.JPG

Based on this analysis, the Cpk calculated with the mixture of 3 normals was 0.65, as opposed to 0.9 calculated with "Normal". Is this approach correct?

 

The second parameter is even more complicated: the data are skewed, mostly because values below the LOQ of the assay are reported as 109.

 

Any help is appreciated!

Thanks

7 REPLIES
cwillden

Community Trekker

Joined:

May 1, 2017

Well, I certainly wouldn't trust the normal estimate, but I'm not sure I'd trust any of those.  Stability is a fundamental assumption for the capability indices.  The data don't have to be normal, but they need to maintain a constant mean and variance.  In the other thread, Mark and Mike talked about understanding why you have multiple modes.  Multiple modes don't necessarily mean the process is unstable, because sometimes there are perfectly reasonable causes of multi-modal data and they can't be "fixed."  However, you really need to do your due diligence.  If the assumption of stability is not met, then your capability estimates are probably not indicative of the capability you will have in the future.

 

That being said, if you feel confident your process is stable, then by all means base the capability calculations on the most appropriate fitted distribution.  An alternative is non-parametric capability, which doesn't assume any distribution.
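To make the difference between the two kinds of estimate concrete, here is a minimal Python sketch (not JSL, and not necessarily what JMP computes internally). It compares the usual normal-theory Cpk with a percentile-based index that replaces the mean with the median and the 3-sigma distances with the distances to the 0.135% and 99.865% quantiles. The function names and the choice of empirical quantiles are assumptions for illustration; with a fitted distribution you would take those quantiles from the fit instead.

```python
import math
import statistics


def cpk_normal(data, lsl, usl):
    """Cpk under a normal assumption: min(USL - mean, mean - LSL) / (3 * s)."""
    mean = statistics.fmean(data)
    s = statistics.stdev(data)
    return min(usl - mean, mean - lsl) / (3 * s)


def quantile(sorted_data, p):
    """Linear-interpolation quantile of already sorted data, 0 <= p <= 1."""
    pos = p * (len(sorted_data) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_data[lo] + frac * (sorted_data[hi] - sorted_data[lo])


def cpk_percentile(data, lsl, usl):
    """Percentile-based index: distances to the spec limits are scaled by the
    distance from the median to the 0.135% / 99.865% quantiles."""
    xs = sorted(data)
    p_low = quantile(xs, 0.00135)
    median = quantile(xs, 0.5)
    p_high = quantile(xs, 0.99865)
    upper = (usl - median) / (p_high - median)
    lower = (median - lsl) / (median - p_low)
    return min(upper, lower)
```

Empirical quantiles that far into the tails need a lot of data to be meaningful, so with a small sample this version mostly reflects the minimum and maximum observed values.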

-- Cameron Willden
ss2980


Thanks Cameron for the quick response! Is there a way to identify these multiple modes in JMP 14 and analyze them?

Also, I forgot to mention that some of the data points highlighted below were determined to be due to sample prep error.  Can these be eliminated from the process capability analysis, or are sample prep and mixing errors considered part of the process?

sample prep error.JPG

 

Also, the goal is not only to calculate Cpk but also to change the specs (tighten or broaden) depending on the results of the capability analysis.

cwillden


I think you can eliminate those samples that you know were prepped incorrectly, since you have an assignable cause.  Your objective is to generalize to unsampled product, and unsampled product is not prepped at all for testing, so I would say that sample prep is not part of the process.

 

JMP can't really figure out which samples belong to which mode; there's no way to determine that precisely.  The best you could probably do is assign each individual to the closest mode, but you are likely to get a lot of them wrong.  If you can, look at samples that come from different parts of the distribution and see if you can figure out how they are different.  We sometimes see multiple modes in our strength testing, and the modes align with different failure modes.
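As a rough illustration of "assign each individual to the closest mode," here is a small Python sketch that assumes you already have the fitted mixture parameters (weight, mean, standard deviation per component) from the distribution fit. The function names are hypothetical. It returns the most likely component together with its posterior probability, which shows how uncertain the assignment is for points between modes.

```python
from statistics import NormalDist


def responsibilities(x, components):
    """Posterior probability of each mixture component for one observation.

    components is a list of (weight, mean, sd) tuples taken from a fitted
    mixture of normals; the values themselves are supplied by the caller.
    """
    weighted = [w * NormalDist(mu, sd).pdf(x) for w, mu, sd in components]
    total = sum(weighted)
    if total == 0.0:
        raise ValueError("x is too far from every component to classify")
    return [d / total for d in weighted]


def assign_mode(x, components):
    """Return (index of the most probable component, its posterior probability)."""
    probs = responsibilities(x, components)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]
```

A point halfway between two equally weighted, equally wide components gets a probability of 0.5 for each, which is exactly the case where the assignment is a coin flip.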

 

It might also help to look at a control chart.  If the different modes of your distributions correspond to periods of time along the X-axis, you may be able to find a root cause.
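For reference, the limits on an individuals (I-MR style) chart can be sketched in a few lines of Python. This assumes the data are in time order and uses the standard constant 2.66 (3 divided by d2 = 1.128 for moving ranges of size 2); the function names are made up for the example.

```python
def individuals_limits(values):
    """Return (LCL, center line, UCL) for an individuals chart.

    Limits are center +/- 2.66 * average moving range, where the moving
    range is the absolute difference between consecutive observations.
    """
    center = sum(values) / len(values)
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    half_width = 2.66 * mr_bar
    return center - half_width, center, center + half_width


def out_of_control(values):
    """Indices of observations falling outside the control limits."""
    lcl, _, ucl = individuals_limits(values)
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]
```

Points outside the limits, or long runs on one side of the center line, are the places to start looking for a time-related cause behind the different modes.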

-- Cameron Willden
ss2980


What could be the justification for shrinking the spec limit in the following case, where the data are heavily skewed?

Impu1.JPG

 

Basically most of the data is at 109 which is the LOQ of the assay. 

Again, the Norm 3 mix looks the best, and the new Cpk calculated with Norm 3 is lower, at 3.1.

impu1 norm3.JPG

 

Is there a way to deal with the near-LOQ data? Is the Cpk calculated this way accurate?

 

Thanks for all the previous replies!

cwillden


I don't know what LOQ means.  I'm not sure I could be of much help with these questions.

-- Cameron Willden
ss2980


LOQ is the limit of quantitation. Basically, that's the lowest value that an assay can reliably report.

bill_worley

Staff

Joined:

Jul 2, 2014

You have chosen Normal 3 mixtures for your distribution, but SHASH and Johnson Su are better choices based on their lower AICc scores.  The lower the AICc the better.
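For anyone comparing fits by hand, the AICc is the AIC plus a small-sample correction, so a model with more parameters has to earn its place with a higher likelihood. Below is a minimal Python sketch of the standard formula; the helper names are hypothetical, and the log-likelihoods and parameter counts have to come from your own fits (for example, a normal has 2 parameters and a mixture of 3 normals has 8).

```python
def aicc(log_likelihood, n_params, n_obs):
    """Corrected Akaike information criterion.

    AICc = -2*logL + 2k + 2k(k + 1) / (n - k - 1), with k parameters and
    n observations. Lower values indicate a better trade-off of fit and
    complexity.
    """
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc needs n_obs > n_params + 1")
    aic = -2.0 * log_likelihood + 2.0 * n_params
    correction = 2.0 * n_params * (n_params + 1) / (n_obs - n_params - 1)
    return aic + correction


def rank_by_aicc(fits, n_obs):
    """Sort candidate fits from best (lowest AICc) to worst.

    fits maps a distribution name to (log_likelihood, n_params).
    """
    scores = {name: aicc(ll, k, n_obs) for name, (ll, k) in fits.items()}
    return sorted(scores.items(), key=lambda item: item[1])
```

With equal log-likelihoods, the fit with fewer parameters always ranks first, which is why a flexible mixture does not automatically win.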

The SHASH distribution is also known as the sinh-arcsinh distribution. This distribution is similar to the Johnson distributions in that it is a transformation to normality, but the SHASH distribution includes the normal distribution as a special case. This distribution can be symmetric or asymmetric.

Also, you have two outliers: one is way below your LOQ and the other is way above your USL.  You may want to hide and exclude those points and refit your distributions to see how much leverage those points have on the overall fit and Cpk.