Apr 3, 2010 1:05 AM
(1728 views)

How do I do an f-test in jmp?

8 REPLIES

Apr 3, 2010 4:59 AM
(1227 views)

The 1 and 10000 are my numerator and denominator degrees of freedom for the F-test. An F-test on (1, N) df is equivalent to a t-test on N df, where the t statistic is the square root of the F statistic. Note, however, that the F-test is one-tailed whereas the t-test is two-tailed, and only one tail is reported in the result. If N is large enough, the t-test is well approximated by a Normal test.
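
For reference, the equivalence described above can be written out as a standard identity. The symbols are introduced here only for illustration: $T_N$ is a Student t variable on N df, $F_{1,N}$ is an F variable on (1, N) df, and $f$ is the observed F statistic.

```latex
T_N^2 \sim F_{1,N}, \qquad
P\left(F_{1,N} > f\right) = P\left(|T_N| > \sqrt{f}\right) = 2\,P\left(T_N > \sqrt{f}\right), \quad f > 0.
```

In other words, the upper-tail F probability equals the two-sided t probability at the square root of the F statistic.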

Apr 6, 2010 6:25 AM
(1227 views)

Apr 6, 2010 1:28 PM
(1227 views)

Thanks for the responses.

I want to compare two variances. I ran the following script:

F Distribution(1.14, 581, 568)

The value 1.14 is the ratio of the squares of the two variances. The script returned a p-value of 0.941. From this I conclude that the variances are not statistically different at the 95% confidence level.

Is this right? I'm a beginner with regards to statistics (and jmp), so forgive me if this seems like a very basic question.

Apr 7, 2010 7:26 AM
(1227 views)

You started along the right path, but there are a few things that are incorrect in your last post.

1. You said 1.14 is the ratio of the squares of the two variances. I think I know what you meant to say, but I want to make sure. The statistic to use is the ratio of the squares of the standard deviations. In other words, it is the ratio of the variances, not the ratio of the squares of the variances.

2. Your script was almost correct. If the ratio of variances is 1.14, and the degrees of freedom are 581 and 568 (the df are n1-1 and n2-1), then the p-value for a two-sided test is found by

2*(1-F Distribution(1.14, 581, 568)) = 0.1168

Alternatively, you can compare 1.14 to an F quantile. If you use an alpha level of 0.05, then the quantile is found by

F Quantile(0.975, 581, 568) = 1.18

where 0.975 = 1 - (0.05/2).

Since the p-value > 0.05 and 1.14 < 1.18, we cannot reject the hypothesis of equal variances at the 0.05 level.
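
If you want to cross-check these numbers outside JMP, here is a minimal sketch in plain Python (standard library only). It is not how JMP computes them: the helper names `_beta_cf`, `reg_inc_beta`, `f_cdf` and `f_quantile` are made up for this illustration, the F CDF is evaluated through the regularized incomplete beta function with a continued fraction, and the quantile is found by simple bisection.

```python
import math

def _beta_cf(a, b, x, max_iter=500, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c, d = 1.0, 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        # Even and odd coefficients of the continued fraction at step m.
        even = m * (b - m) * x / ((a + 2 * m - 1) * (a + 2 * m))
        odd = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1))
        for coef in (even, odd):
            d = 1.0 + coef * d
            c = 1.0 + coef / c
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = c if abs(c) > tiny else tiny
            delta = c * d
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    # Use the symmetry I_x(a, b) = 1 - I_{1-x}(b, a) where it converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b

def f_cdf(f, df_num, df_den):
    """P(F <= f) for an F distribution on (df_num, df_den) degrees of freedom."""
    if f <= 0.0:
        return 0.0
    x = df_num * f / (df_num * f + df_den)
    return reg_inc_beta(df_num / 2.0, df_den / 2.0, x)

def f_quantile(p, df_num, df_den):
    """Value q with P(F <= q) = p, found by bisection on f_cdf."""
    lo, hi = 0.0, 1.0
    while f_cdf(hi, df_num, df_den) < p:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if f_cdf(mid, df_num, df_den) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Numbers quoted in the thread: variance ratio 1.14 on (581, 568) df.
ratio, df1, df2 = 1.14, 581, 568
cdf = f_cdf(ratio, df1, df2)
# Two-sided p-value; for a ratio above 1 this is the same as 2 * (1 - cdf).
p_two_sided = 2.0 * min(cdf, 1.0 - cdf)
crit = f_quantile(0.975, df1, df2)
print(f"two-sided p = {p_two_sided:.4f}, 0.975 quantile = {crit:.3f}")
```

The printed values should land close to the 0.1168 and 1.18 given above; small differences in the last digit can come from rounding.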

If you have the raw data, you can feed it into Fit Y by X and do the F-test.

Apr 8, 2010 7:56 AM
(1227 views)

Could you tell us what the data actually is, please? You've got two variances, and even though either could theoretically be larger than the other one, it might not make sense to test both alternatives. For example, if you had the results of 10 types of chemical reaction, each of them replicated two or three times, and you wanted to know whether there was a difference between the types of reaction, you would only test to see whether the

Also, is there anything that ties pairs, or groups, of observations together here? The note above about feeding it through the Fit Y by X procedure would only apply if there were, and even then, that would tell you something about the relationship between the two sets of observations, not whether one of them was more variable than the other. So really we need to know what the structure of the data actually is.

Apr 8, 2010 8:37 AM
(1227 views)

Use Fit Y by X, Oneway. You can use the Unequal Variances command to test for equal variances. If there are two groups of data, then that produces the F test I described above. I was not referring to the ANOVA F test for comparing means.

Apr 8, 2010 12:36 PM
(1227 views)

Yes, that's what I meant. Thanks for the correction and for the correct script.

Here's what I'm trying to do:

I built a model with the Fit Model platform, using data from one process run on one process tool. I built this model from a randomly selected half of my data set. As a way of validating the model, I used the resulting prediction formula to predict the other half of the data set. This gave me two sets of residuals, one from each subset of the data. I want to test whether the two distributions of residuals are equivalent by doing an F-test (for equal variances) and a t-test (for equal means).
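
To make the comparison concrete, here is a rough Python sketch of the two statistics involved. The residuals are simulated placeholders, not real data; the group sizes 582 and 569 are chosen only so that the degrees of freedom match the 581 and 568 quoted earlier; and the function names are hypothetical. The test for equal means uses the large-sample Normal approximation mentioned in the first reply rather than an exact t-test.

```python
import math
import random
from statistics import NormalDist, mean, variance

def variance_ratio(res_a, res_b):
    """Return (F, df_num, df_den) for the ratio of the two sample variances."""
    return variance(res_a) / variance(res_b), len(res_a) - 1, len(res_b) - 1

def mean_difference_z(res_a, res_b):
    """Return (z, two-sided p) for equal means, using a Normal approximation."""
    se = math.sqrt(variance(res_a) / len(res_a) + variance(res_b) / len(res_b))
    z = (mean(res_a) - mean(res_b)) / se
    return z, 2.0 * (1.0 - NormalDist().cdf(abs(z)))

# Hypothetical residuals standing in for the model-building and validation halves.
random.seed(0)
train_resid = [random.gauss(0.0, 1.0) for _ in range(582)]
valid_resid = [random.gauss(0.0, 1.0) for _ in range(569)]

f_stat, df1, df2 = variance_ratio(train_resid, valid_resid)
z, p_means = mean_difference_z(train_resid, valid_resid)
print(f"variance ratio F = {f_stat:.3f} on ({df1}, {df2}) df")
print(f"mean difference z = {z:.3f}, two-sided p = {p_means:.3f}")
```

The variance ratio and its degrees of freedom are what go into F Distribution to get the p-value for the equal-variance test.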

I performed the Fit Y by X, Oneway, Unequal Variances command as suggested by jg. For the 2-sided F-test, it gave a p-value of 0.1139, which is different from the p-value of 0.1168 given in jg's post. Should they be the same? Did I do something wrong?

Let me know if this is not enough background info to give the gist of what I'm trying to do.

Apr 9, 2010 10:18 AM
(1227 views)

Not only do you want the residuals to have similar means, but you want the means to be 0, for a good model. Also, you don't want there to be any trends over time, etc. You can use the residual plots to assess these things.
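
As a numeric companion to the residual plots, the two checks above (mean near 0, no trend over time) could be sketched in Python as follows. This assumes the residuals are listed in run order; the function name `residual_checks` and the sample values are made up for illustration.

```python
import math
from statistics import mean, stdev

def residual_checks(residuals):
    """Return the residual mean, its standard error, and the slope vs. run order."""
    n = len(residuals)
    avg = mean(residuals)
    std_err = stdev(residuals) / math.sqrt(n)
    # Least-squares slope of residual against observation index 0..n-1.
    idx_mean = (n - 1) / 2.0
    sxx = sum((i - idx_mean) ** 2 for i in range(n))
    sxy = sum((i - idx_mean) * (r - avg) for i, r in enumerate(residuals))
    return {"mean": avg, "std_err": std_err, "slope": sxy / sxx}

# Hypothetical residuals in run order.
example = [0.3, -0.1, 0.2, -0.4, 0.1, -0.2, 0.0, 0.1]
checks = residual_checks(example)
print(checks)
```

A mean that is small relative to its standard error and a slope close to 0 are consistent with what you want to see; the plots remain the better tool for spotting curvature or changing spread.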

I don't know anything about the model you're building, or your industry or application, but here are some other things you might need to consider. You said you built the model from data on one run from one tool. Would a different process run produce a wildly different model? Would a different tool produce a different model? If you're going to use this model to predict future runs across different tools, you may want the model to be based on data that includes run-to-run variability, tool-to-tool variability, and any other source of variability that may be important.

I'm glad you understand and are using the concept of testing a model on independent data.