I have two possible test methods for measuring a response, and I want to determine which test method is the better measurement system. I've done a Gauge R&R study for each test method (using the same parts and the same operators, varying only the method used). The results of the two studies are quite similar, so I'd like to do a hypothesis test to understand whether the test method with the slightly lower R&R% is significantly better than the other test method.
The Gauge R&R/MSA platform doesn't provide a suitable option for doing this, as far as I can tell (whether I do separate analyses or include the test method type as a binary grouping factor). The only place where I can find rigorous heterogeneity of variance tests in JMP is in the Fit Y by X platform, but that won't do for this situation because there are multiple factors to consider (operator, part, operator*part, and test method type).
So my thinking on how to tackle this is as follows:
Does this sound like a valid approach? Does anyone have any other suggestions for how I could approach this?
Perhaps I'm missing something...but it sounds like you just replicated the same experimental treatment combinations across both test methods. So why can't you just create a data table just like you describe in step 1. above and then use the Unequal Variances test within the Fit Y by X platform to answer the homogeneity of variance question?
But let's back up a second: The other thing I'll bring up is you are asking a vague question, "Which test method is better?" and then selecting one and only one characteristic, homogeneity of variance, to answer this question. There are many aspects of evaluating a measurement system and 'better' could be evaluated by any one of these singly or as a subset.
For example, I could envision a scenario where the two systems show homogeneity of variance with respect to each other (the only thing you appear to be interested in) but are biased, either relative to each other or relative to a standard. So I would be loath to say one system is better than the other solely by looking at homogeneity of variance between the two systems.
Yes, the combinations were the same across both test methods, but unless at least the variation between the 3 parts is accounted for, this would mean testing for homogeneity of variance on data that is very non-normal - in fact it would be tri-modal. I don't have enough faith in the ability of any of the parametric homogeneity of variance tests to deal with extreme non-normality for this to be a valid approach, and I don't think JMP has any non-parametric equivalents.
I did try fitting response by test method type separately for each part to give 3 separate tests, but this gave a mixture of results leading to no overall conclusion.
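One way to account for the part-to-part variation before comparing spreads is to center each reading on the mean of its own part and test method, then pool the residuals by method. The following is only a minimal sketch in Python rather than JMP; the function name `center_within_part` and the data are hypothetical, made up purely for illustration. Centering on the part-by-method mean also removes any bias between the methods, and it uses up degrees of freedom, so any variance test run on these residuals is approximate.

```python
from collections import defaultdict
from statistics import mean

def center_within_part(rows):
    """rows: iterable of (method, part, value) tuples.

    Returns {method: [residuals]}, where each residual is the reading
    minus the mean of its own (method, part) cell.
    """
    cells = defaultdict(list)
    for method, part, value in rows:
        cells[(method, part)].append(value)

    residuals = defaultdict(list)
    for (method, _part), values in cells.items():
        cell_mean = mean(values)
        residuals[method].extend(v - cell_mean for v in values)
    return dict(residuals)

# Hypothetical readings: 2 methods x 3 parts x 2 repeats (illustration only).
rows = [
    ("A", "P1", 10.1), ("A", "P1", 9.9),
    ("A", "P2", 20.2), ("A", "P2", 19.8),
    ("A", "P3", 30.1), ("A", "P3", 29.9),
    ("B", "P1", 10.4), ("B", "P1", 9.6),
    ("B", "P2", 20.3), ("B", "P2", 19.7),
    ("B", "P3", 30.5), ("B", "P3", 29.5),
]

residuals = center_within_part(rows)
for method, res in residuals.items():
    print(method, [round(r, 2) for r in res])
```

The raw readings cluster around three part levels, while the residuals for each method are centered on zero, so the tri-modality no longer dominates the comparison.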
Point taken on there being other important factors in measurement system capability - I'd already detected a small but significant bias between the two methods, but in this case there is no way to know the 'true' value so this doesn't really help!
I have been following this discussion and Peter makes some good points. The hard part of this and other similar discussions is that it often takes "seeing the data" and really understanding the questions to be answered to provide guidance for analysis.
However, JMP does have a non-parametric test for variances. In the faithful Fit Y by X platform, when you have a categorical X and a continuous Y, there is an "Analysis of Means" menu option. Under this option you will find "ANOM for Variances with Levene (ADM)", where ADM stands for absolute deviation from the median. You can look this up in the JMP books for details.
This might not be what you need but wanted to point out its existence.
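To show what the ADM transform does, here is a minimal Python sketch of the Levene-type statistic computed on absolute deviations from each group's median (often called the Brown-Forsythe variant). This is an assumption-laden illustration, not JMP's implementation: JMP presents the ANOM version as a chart with decision limits, whereas this sketch computes the one-way ANOVA F ratio on the same transformed values. The function name `levene_adm_statistic` and the data are hypothetical.

```python
from statistics import mean, median

def levene_adm_statistic(groups):
    """One-way ANOVA F ratio on absolute deviations from each group's median.

    groups: list of lists of readings, one list per test method.
    """
    adm = []
    for g in groups:
        med = median(g)
        adm.append([abs(x - med) for x in g])

    k = len(adm)
    n_total = sum(len(z) for z in adm)
    grand_mean = mean([v for z in adm for v in z])

    # Between-group and within-group mean squares of the ADM values.
    between = sum(len(z) * (mean(z) - grand_mean) ** 2 for z in adm) / (k - 1)
    within = sum((v - mean(z)) ** 2 for z in adm for v in z) / (n_total - k)
    return between / within

# Hypothetical data: the second group is twice as spread out as the first.
method_a = [1, 2, 3, 4, 5]
method_b = [1, 3, 5, 7, 9]
print(levene_adm_statistic([method_a, method_b]))
```

A p-value needs the F distribution with (k - 1, N - k) degrees of freedom; if SciPy is available, `scipy.stats.levene(method_a, method_b, center="median")` returns the same kind of statistic together with a p-value.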
Have you tried using Wheeler's EMP method to analyze the results? The Range chart will give you a view of the test-retest error for the two methods that is agnostic to part-to-part mean variation, which sounds like your biggest concern here. It's not a hypothesis test...but if there is a difference it should show up there too. I'm not a big fan of hypothesis tests in general because they tend to make the world look binary...I'd rather look at a plot of the data and draw my conclusions by interpreting the plots, in the context of the practical decisions that need to be made.
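The number behind the Range chart can be sketched as follows: take the range of the repeat readings within each operator-by-part cell, average those ranges, and divide by the bias-correction constant d2 for the subgroup size to estimate the test-retest standard deviation. This is a rough Python illustration, not the JMP EMP platform itself; the function name `repeatability_sigma` and the data are hypothetical, and the d2 values are the standard control-chart constants for subgroup sizes 2 to 5.

```python
from collections import defaultdict
from statistics import mean

# Standard d2 bias-correction constants, keyed by subgroup size.
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326}

def repeatability_sigma(rows, repeats):
    """rows: iterable of (operator, part, value) for ONE test method.

    repeats: number of repeat readings in every operator-by-part cell.
    Returns the average within-cell range divided by d2.
    """
    cells = defaultdict(list)
    for operator, part, value in rows:
        cells[(operator, part)].append(value)

    for key, values in cells.items():
        if len(values) != repeats:
            raise ValueError(f"cell {key} has {len(values)} readings, expected {repeats}")

    ranges = [max(v) - min(v) for v in cells.values()]
    return mean(ranges) / D2[repeats]

# Hypothetical readings for one method: 2 operators x 2 parts x 2 repeats.
method_a = [
    ("Op1", "P1", 10.0), ("Op1", "P1", 10.2),
    ("Op1", "P2", 20.0), ("Op1", "P2", 20.4),
    ("Op2", "P1", 10.1), ("Op2", "P1", 10.3),
    ("Op2", "P2", 20.1), ("Op2", "P2", 20.5),
]
print(repeatability_sigma(method_a, repeats=2))
```

Running the same function on each method's data gives two test-retest estimates to compare side by side, alongside the Range charts themselves. Because each range is taken within a single operator-by-part cell, differences between part means do not enter the estimate.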