Discussions

Isabel26 · Jun 8, 2023 5:44 PM

I am doing a Cpk analysis. I used Continuous Fit > Fit All to find the best distribution for the data, then ran Cpk. JMP suggested a best option, and I ran Goodness of Fit to confirm it by p-value. However, p < 0.001, which rejects the distribution JMP suggested. I assume this means JMP does not have the proper distribution available. My question: can I just use the distribution with the lowest AICc in JMP to run Cpk? If not, what are my options for getting a correct Cpk? Thanks.
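For context on the two criteria in this question: AICc ranks candidate fits relative to each other, while a goodness-of-fit test asks whether one specific fit is adequate in absolute terms, so the lowest-AICc distribution can still be rejected. The sketch below shows only the standard AICc formula. The log-likelihoods, parameter counts, and candidate names are hypothetical values chosen for illustration, not output from JMP.

```python
def aicc(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Corrected Akaike information criterion (smaller is better)."""
    if n_obs - n_params - 1 <= 0:
        raise ValueError("AICc needs n_obs > n_params + 1")
    aic = -2.0 * log_likelihood + 2.0 * n_params
    # Small-sample correction term; it vanishes as n_obs grows.
    return aic + (2.0 * n_params * (n_params + 1)) / (n_obs - n_params - 1)


# Hypothetical maximized log-likelihoods for two fits to the same 1000 rows.
candidates = {
    "candidate A (4 parameters)": aicc(-1480.0, 4, 1000),
    "candidate B (2 parameters)": aicc(-1495.0, 2, 1000),
}

best = min(candidates, key=candidates.get)
for name, value in sorted(candidates.items(), key=lambda kv: kv[1]):
    print(f"{name}: AICc = {value:.2f}")
print("lowest AICc:", best)
```

The ranking says which candidate is the least bad of those tried. It says nothing about whether that candidate describes the tails well enough for a capability estimate.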

statman · Jan 11, 2022 04:05 PM

I am not the right person to provide answers to your questions, but I do have some food for thought:

I believe what you are trying to do is get an estimate of the variation of whatever it is you are measuring and compare that to specifications. There are many ways to do this, just as there are many indices to summarize those comparisons. ALL of the indices are estimates. There is NO correct one. The purpose of analyzing the distribution of the data is to determine which summary statistics best describe the central tendency and dispersion. Just as important (or perhaps more so) is to determine whether the data come from a stable system. If not, the indices will surely not be useful.

"All models are wrong, some are useful" G.E.P. Box

Isabel26 · Jan 11, 2022 04:35 PM

Totally agree. I just do not have solid support for my Cpk results if people question the p-value in my distribution determination.

Mark_Bailey · Jan 12, 2022 11:10 AM

Why are these distribution models, even the 'best' one, not fitting well? Please provide a picture of the histogram in Distribution with the fitted PDF overlaid. Also, can you provide the quantile plot using the command in the red triangle for the selected fit? It might be a data problem or an outlier problem.

The capability indices strongly depend on the definition of the tails of the distribution, so having a good fit is important if you plan to use the estimates from the fitted CDF in the calculation.
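To illustrate how much the tails matter, here is a minimal sketch of one common convention for non-normal capability, the percentile method, which replaces the mean and 3-sigma spread with the median and the 0.135% and 99.865% quantiles of the fitted distribution. Everything specific in it is an assumption for illustration: the lognormal parameters, the spec limits, and the function names are hypothetical, and the index JMP reports depends on the method selected there.

```python
import math
from statistics import NormalDist


def percentile_ppk(quantile, lsl, usl):
    """Percentile-method capability from a fitted quantile function."""
    median = quantile(0.5)
    lower = quantile(0.00135)   # plays the role of mean - 3*sigma
    upper = quantile(0.99865)   # plays the role of mean + 3*sigma
    ppl = (median - lsl) / (median - lower)
    ppu = (usl - median) / (upper - median)
    return min(ppl, ppu), ppl, ppu


# Hypothetical right-skewed fit: lognormal with these log-scale parameters.
MU, SIGMA = 0.0, 0.25


def lognormal_quantile(p):
    return math.exp(MU + SIGMA * NormalDist().inv_cdf(p))


LSL, USL = 0.4, 2.5  # hypothetical spec limits

ppk, ppl, ppu = percentile_ppk(lognormal_quantile, LSL, USL)

# Normal-theory index computed from the same distribution's mean and SD.
mean = math.exp(MU + SIGMA**2 / 2)
sd = mean * math.sqrt(math.exp(SIGMA**2) - 1)
normal_ppk = min((USL - mean) / (3 * sd), (mean - LSL) / (3 * sd))

print(f"percentile method: Ppl={ppl:.3f}  Ppu={ppu:.3f}  Ppk={ppk:.3f}")
print(f"normal-theory Ppk on the same data: {normal_ppk:.3f}")
```

The two indices disagree because they place the tails differently, which is why a fitted model that misses the tail can give a misleading index even when the bulk of the histogram looks fine.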

Isabel26 · Jan 12, 2022 8:53 AM

I attached the picture in the original post. I may be wrong. The attached picture shows that JMP selected SHASH as the best fit, but the goodness-of-fit p-value is < 0.001. My understanding is that this p-value rejects the hypothesis that SHASH is the distribution. The overlay seems like a good fit and the Q-Q plot looks good too, but I don't understand why the p-value is so low.

Dan_Obermiller · Jan 12, 2022 12:13 PM

The pictures that @Mark_Bailey is looking for would be extremely helpful. As he stated, the small p-value could be caused by an outlier. The pictures will help determine that. Also, how many observations do you have? With enough data, any small discrepancy that is not of any practical significance could be flagged as statistically significant. Again, the pictures that Mark is asking for will help determine that, too.

Dan Obermiller

Isabel26 · Jan 12, 2022 01:54 PM

I have over 1k observations. The nature of the data is just one peak followed by a skinny tail.

Mark_Bailey · Jan 13, 2022 08:07 AM

These tests for goodness of fit are notoriously sensitive when the amount of data is large. The idea is simply that real data, in large quantities, rarely follow an idealized distribution model. Even large quantities of data sampled from an ideal population (e.g., a random number generator for a given distribution) will often show a statistically significant departure from the model, which is a type I error.
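The sample-size sensitivity can be reproduced with a small simulation. The sketch below is an illustration under stated assumptions, not the test JMP runs: it uses a one-sample Kolmogorov-Smirnov test against a fully specified standard normal, an asymptotic approximation for the p-value, and data that are deliberately slightly off-model (90% standard normal mixed with 10% normal with SD 2). The mixture, seed, and sample sizes are hypothetical choices.

```python
import math
import random


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ks_statistic(data, cdf):
    """Largest gap between the empirical CDF and the model CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def ks_pvalue(d, n, terms=100):
    """Asymptotic Kolmogorov p-value with a small-sample adjustment (approximate)."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


def sample(n, rng):
    # 90% standard normal, 10% normal with SD 2: a mild departure from N(0, 1).
    return [rng.gauss(0.0, 2.0 if rng.random() < 0.10 else 1.0) for _ in range(n)]


rng = random.Random(2022)
results = {}
for n in (200, 50_000):
    d = ks_statistic(sample(n, rng), normal_cdf)
    results[n] = (d, ks_pvalue(d, n))
    print(f"n={n:>6}: D={d:.4f}  p={results[n][1]:.3g}")
```

The underlying discrepancy is the same at both sample sizes (the maximum CDF gap for this mixture is under 0.02), but only the large sample has enough power to flag it. That is why the overlay and quantile plot are the better guide to whether a departure matters in practice.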

Mark_Bailey · Jan 12, 2022 12:38 PM

The plots (i.e., histogram and quantile plot) are not visible in either of the pictures that you provided.

Isabel26 · Jan 12, 2022 01:53 PM

Here are the histogram and quantile plot.

distribution of the data is not in JMP during CPK analysis
