Differences in GRR results between JMP and MINITAB

Xinghua
Level III

This problem has caused long-term confusion and has never been fully resolved.


0. The ANOVA results are exactly the same; only the variance component results are different. However, MINITAB performs two analyses of variance (with and without the interaction).

01.jpg


1. Minitab's help gives the following formulas for calculating the variance components, but it does not say which method this is, EMS or REML. According to my verification, when a negative value appears, it is forced to 0. JMP's help does not explain the specific calculation method for the variance components.

 

02.jpg
2. In JMP, there are two variance component results, as shown in the figure below, and these results are also different. I don't know why.

 

03.jpg


3. One thing is certain: the results from MINITAB are consistent with those in the MSA manual.

Accepted Solutions

awelsh
Level III


Re: Differences in GRR results between JMP and MINITAB

There are multiple ways to approach any problem. Ask 10 different people and you'll get 10 different answers, none of them wrong. (Unless, I suppose, the person you're asking is just uninformed.)

 

For cases like this, you need to use the method your customer asks you to use. Simple as that. If they want the MSA manual method, give them that. If they want Minitab, give them that. If they want JMP, give them that. If they have some Excel template, just use that. Etc.

 

Now, for your own internal approval of measurement systems, where you get to set the method, just choose the one you're most familiar with. If a measurement system is good, it should pass all methods.

 

Personally, I like Wheeler's approaches and a control-chart way of assessing measurement systems. It's straightforward and easy to understand and apply. All these enumerative statistics create confusion and wasted debate, as you've demonstrated.

 

Good luck. I love the community here, and it seems some good detective work has been done to compare all the methods. A significant interaction effect means the MSA fails anyway: we would never want the measurement of a part to depend on who took the measurement. So if the difference between JMP and Minitab only arises when there is a significant interaction effect, then it doesn't matter what the p-value or sum-of-squares numbers are. The MSA fails, end of story. Improve the method so that all operators get the same results, and then redo the MSA to confirm it now passes without a significant interaction.
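For anyone who wants to try that control-chart view, here is a minimal Python sketch with hypothetical measurements: each subgroup is one operator measuring one part three times, repeatability is estimated from the average within-subgroup range as R-bar/d2, and any range above D4*R-bar flags an operator/part combination that was measured inconsistently.

# Range-chart look at repeatability (hypothetical data). Constants d2 and D4
# are the usual control-chart constants for subgroup size n = 3.

d2, D4 = 1.693, 2.574

trials = {                      # (operator, part): repeated measurements (hypothetical)
    ("A", 1): [2.01, 2.03, 2.02],
    ("A", 2): [2.48, 2.50, 2.49],
    ("B", 1): [2.00, 2.06, 2.01],
    ("B", 2): [2.47, 2.52, 2.49],
}

ranges = {k: max(v) - min(v) for k, v in trials.items()}
r_bar = sum(ranges.values()) / len(ranges)

sigma_repeat = r_bar / d2       # repeatability (test-retest error) estimate
ucl_r = D4 * r_bar              # ranges above this limit signal inconsistency

print(f"R-bar = {r_bar:.4f}   sigma_repeatability ~ {sigma_repeat:.4f}")
for (op, part), rng in ranges.items():
    note = "  <-- above UCL" if rng > ucl_r else ""
    print(f"operator {op}, part {part}: range = {rng:.3f}{note}")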


12 REPLIES
jthi
Super User


Re: Differences in GRR results between JMP and MINITAB

Have you read through the JMP documentation?

There are also community posts about this topic.

You can also force JMP to give similar results to Minitab, but you have to partially script it yourself (force EMS to be used, dropping the interaction, ...).

-Jarmo
Xinghua
Level III


Re: Differences in GRR results between JMP and MINITAB

Thank you, I have read these.
I know there are three methods: EMS, REML, and Bayes. According to the JMP help, if the data are balanced and there are no negative values, the EMS method is used. But it still does not give the same result as MINITAB. You can try it.

MRB3855
Super User


Re: Differences in GRR results between JMP and MINITAB

Hi @Xinghua : not sure if I'm answering your question.

 

1. The formulas you see from Minitab are EMS. And there is no closed-form formula for REML; the REML solution is found numerically.

 

2. The Var Comp for Gage R&R section in JMP is based on sums of the variance components shown in the Variance Components section above.

 

Repeatability = Within

Reproducibility = Inspector + Inspector*Samples

Part to Part = Samples

And Gage R&R = Repeatability + Reproducibility

 

How are the Var Comps in Minitab different than JMP?
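For anyone checking the numbers by hand, below is a minimal Python sketch of those EMS formulas for a balanced crossed study (p parts, o operators, r trials), rolled up exactly as listed above. The mean squares are hypothetical placeholders, so substitute the values from your own ANOVA table; negative estimates are truncated to zero, as the AIAG manual and Minitab do.

# Minimal sketch of the EMS / ANOVA-method variance components for a
# balanced crossed Gage R&R study. All mean squares are hypothetical
# placeholders; substitute the values from your own ANOVA table.

p, o, r = 10, 3, 3         # parts, operators (inspectors), trials per cell

ms_part = 2.05             # MS Part            (hypothetical)
ms_oper = 0.12             # MS Operator        (hypothetical)
ms_int  = 0.04             # MS Operator*Part   (hypothetical)
ms_err  = 0.03             # MS Error / Within  (hypothetical)

def nonneg(x):
    # The AIAG manual and Minitab truncate negative estimates to zero.
    return max(x, 0.0)

# Full model (interaction kept):
repeatability = ms_err                                  # sigma^2 repeatability
oper_by_part  = nonneg((ms_int - ms_err) / r)           # sigma^2 Operator*Part
operator      = nonneg((ms_oper - ms_int) / (p * r))    # sigma^2 Operator
part_to_part  = nonneg((ms_part - ms_int) / (o * r))    # sigma^2 Part

reproducibility = operator + oper_by_part
gage_rr = repeatability + reproducibility
total = gage_rr + part_to_part

for name, vc in [("Repeatability", repeatability),
                 ("Reproducibility", reproducibility),
                 ("Gage R&R", gage_rr),
                 ("Part-to-Part", part_to_part),
                 ("Total", total)]:
    print(f"{name:16s}{vc:10.5f}{100 * vc / total:9.2f} % of total variance")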

 

Xinghua
Level III


Re: Differences in GRR results between JMP and MINITAB

 

Thank you very much. I am not a professional statistician. I just want to know how to answer customers when they ask about inconsistent results. MINITAB's algorithm is consistent with the AIAG MSA manual.
In addition, according to the JMP help description, when the data are balanced and the calculated variance components are not negative, the EMS method is used. But according to my verification, the results of MINITAB and JMP are always different. Generally, we do GRR with 3 operators * 3 trials * 10 samples.

 

2025-02-26_085620.jpg

MRB3855
Super User


Re: Differences in GRR results between JMP and MINITAB

Hi @Xinghua : If I use the "2 Factors Crossed" data set in the Variability Data folder in the JMP Sample Data Folder, JMP gives the exact same results as Minitab.

Xinghua
Level III


Re: Differences in GRR results between JMP and MINITAB

Thank you. I checked it out, and they are exactly the same, as you said. I noticed that its interaction effect is significant (inspectors * parts). In actual processes, I have never seen a significant interaction effect.


This may be the root cause of the difference, because if the interaction effect is not significant, MINITAB switches to another calculation method. I will continue to check it.

 

111.jpg

Xinghua
Level III


Re: Differences in GRR results between JMP and MINITAB

I verified it again. Its interaction effect is not significant.
After forcing the interaction effect to be included in the variance components (even though the interaction effect is not significant in the ANOVA), the results of JMP and EXCEL are exactly the same.
This may be a mistake in MINITAB, because the AIAG MSA manual does not mention "When the interaction effect is not significant, how to...".

 

555.jpg
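That is consistent with what was observed above: when the Operator*Part term is not significant, Minitab reports the reduced (no-interaction) ANOVA and pools the interaction sum of squares into the error term before computing the variance components. A minimal Python sketch of that pooled calculation, reusing the hypothetical numbers from the earlier sketch in this thread:

# Reduced (no-interaction) model: the interaction SS and df are pooled into
# the error term before the variance components are computed. All numbers
# are hypothetical and consistent with the earlier sketch.

p, o, r = 10, 3, 3

ss_int, df_int = 0.72, 18    # Operator*Part SS, df = (p-1)*(o-1)  (hypothetical)
ss_err, df_err = 1.80, 60    # Error SS, df = p*o*(r-1)            (hypothetical)
ms_part = 2.05               # MS Part      (hypothetical)
ms_oper = 0.12               # MS Operator  (hypothetical)

ms_pooled = (ss_int + ss_err) / (df_int + df_err)    # pooled repeatability MS

repeatability   = ms_pooled
operator        = max((ms_oper - ms_pooled) / (p * r), 0.0)
part_to_part    = max((ms_part - ms_pooled) / (o * r), 0.0)
reproducibility = operator        # no Operator*Part component in this model
gage_rr         = repeatability + reproducibility

print(f"Repeatability   {repeatability:.5f}")
print(f"Reproducibility {reproducibility:.5f}")
print(f"Gage R&R        {gage_rr:.5f}")
print(f"Part-to-Part    {part_to_part:.5f}")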

MRB3855
Super User


Re: Differences in GRR results between JMP and MINITAB

Hi @Xinghua : I'm not so sure I'd label these kinds of things as "mistakes". While the AIAG MSA manual is a common reference for these kinds of problems, it should not be viewed as correct to the exclusion of other methods and other ways of approaching the problem. And a case could be made either way for including the interaction or not when it is not "significant".

 

That said, I also recognize your dilemma: how do you respond to customers when they ask about inconsistent results? The EMS method is an older method than REML etc.; it can be implemented without any sort of sophisticated numerical algorithm, i.e., there are formulas. It is not the "best" method, nor is it the only method. It is, however, simple and fit for purpose (hence its appeal in such manuals). To include or not include the interaction term? Good question. I don't want to venture too far down that rabbit hole. But the case for not including it is straightforward: if the interaction effect is negligible (as could be inferred from its being non-significant), why include it? On the other hand, if we think of variance components analysis as an estimation problem (rather than a test of significance), then a case can be made for including the interaction (regardless of significance).

Victor_G
Super User


Re: Differences in GRR results between JMP and MINITAB

Hi @Xinghua and @MRB3855,

 

Sorry to chime in, but I wanted to provide some comments about the discussion here:

  1. IMHO, the "choice" to analyze your experiments with or without interactions ("operator*part", for example) is imposed by how you designed your experiments. If you generated a DoE for your MSA study in which two-factor interactions are included in the assumed model, you should analyze your MSA results with this interaction, whether or not it is statistically significant.
    It's not a question of model building/refinement; it's a question of assessing the variance sources based on a specified set of experiments that answers a specific need.
  2. Differentiate statistical significance from practical significance; in this MSA situation you're investigating the relative importance of variance sources like repeatability and reproducibility. A variance source that is not statistically significant can still be practically significant! Evaluation requires domain expertise about the equipment being studied to assess, evaluate, and validate the breakdown of variance sources and the relative size/magnitude of these variances.
  3. Communication and visualization are key. As mentioned by @awelsh, if you have to communicate your findings (particularly to non-statisticians/data scientists), I'm not sure providing raw analysis results makes sense for them (and comparing results between two software packages at the decimal place is irrelevant for them). The Average and Range/StdDev charts are helpful to detect problems related to repeatability and reproducibility (if any):
    Victor_G_1-1740666153243.png
    What the stakeholders want to know are the practical outcomes of your analysis: what are the main sources of variation? If the equipment/response is highly noisy, on which aspect (repeatability/reproducibility) should they focus their effort? If they have several protocols/responses, how do these perform relative to each other?
    On this last point, like @awelsh, I also very much like Wheeler's classes, because with visualization you can very quickly show and rank different responses based on their "information quality" (example here on an anonymized real use case):
    Victor_G_0-1740665521389.png
    I find it a lot clearer for understanding, comparing, and ranking the different responses.
  4. MSA is a collection of good practices and guidelines, but depending on the context and industry settings, you may have different norms, guidelines, or restrictions to respect. Engage with domain experts and stakeholders to adapt your methodology to the real operating conditions. The selection (and number) of parts is particularly crucial: you can end up with a failed MSA study if your batches/parts are too close/similar to each other and not representative of the population, because the variance related to reproducibility and repeatability can then appear "inflated" compared to the part-to-part variance component.

 

These points won't answer your direct questions, but I found them pragmatic and rational when dealing with MSA studies. I'm sure @statman would also have a lot to say.

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)