Comparing results

May 4, 2011 11:31 AM
(814 views)

3 REPLIES

May 5, 2011 2:02 AM
(757 views)

Secondly, run an analysis on the combined data set (i.e. assuming a single flow rate parameter) and again take a note of the residual sum of squares and the degrees of freedom you get. The total residual sum of squares from the separate analyses above should be less than the figure you now get because you originally fitted a more complex version of the same model; the real question is whether it's significantly less.

To test it, work out the difference between the two, and divide that difference by the difference in the two residual degrees of freedom (which should be exactly 1, since one model has just one parameter more than the other one): that gives you the residual mean square of the difference. Compare that with the residual mean square of the more complex model (because that's the best estimate you've got of pure random variation) in an F test. If the result is significant, that means there's a significant difference between the two parameters, i.e. the two flow rates - because it means that including an additional parameter has resulted in significantly more variation being explained by the model.
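As a rough illustration of the arithmetic described above, here is a minimal Python sketch. The function name `extra_ss_f` and the numbers fed into it are made up for the example; in practice you would plug in the residual sums of squares and degrees of freedom from your own JMP reports.

```python
def extra_ss_f(sse_simple, df_simple, sse_complex, df_complex):
    """F statistic for comparing two nested models.

    sse_*: residual sums of squares; df_*: residual degrees of freedom.
    The simpler model has fewer parameters, hence more residual df.
    """
    df_diff = df_simple - df_complex
    if df_diff <= 0:
        raise ValueError("the simpler model must have more residual df")
    # Mean square attributable to the extra parameter(s)
    ms_diff = (sse_simple - sse_complex) / df_diff
    # Residual mean square of the more complex model
    ms_resid = sse_complex / df_complex
    return ms_diff / ms_resid, df_diff, df_complex


# Hypothetical figures, for illustration only
f_stat, df_num, df_den = extra_ss_f(
    sse_simple=12.0, df_simple=18, sse_complex=9.0, df_complex=17
)
print(f_stat, df_num, df_den)
```

The resulting statistic is then referred to an F distribution with `df_num` and `df_den` degrees of freedom, from tables or from any statistics package.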

May 12, 2011 4:20 PM
(757 views)

Many thanks for your reply. Can I clarify on what you suggested:

> To test it, work out the difference between the two, and divide that difference by the difference in the two residual degrees of freedom (which should be exactly 1, since one model has just one parameter more than the other one): that gives you the residual mean square of the difference.

I assume the "difference between the two" that you mentioned means the difference between the residual sum of squares (SSE) from the combined data set analysis and the total SSE obtained from the separate analyses. The difference between the two residual degrees of freedom is 2, however. I'm not sure if that's because I don't have an equal number of data points in each set.

Would the results be valid if I have unequal variance between the two data sets?

May 13, 2011 1:35 AM
(757 views)

So for example, if the underlying model you were fitting were y=A*exp(-alpha*t), and you wanted to know if the estimate of alpha was the same for two different data sets, say of sizes N1 and N2, then the simpler model would have just two parameters (A and alpha), whereas the more complex model would have three (A, alpha1 and alpha2). I'm assuming here that A is the same under both models: if it isn't, then there

You should find then that the simpler model would have 2 df for the regression, (N1+N2) df for the total sum of squares, and (N1+N2-2) for the residual. The more complex model would have 3 df for the regression, (N1+N2) df for the total, and (N1+N2-3) for the residual. (Incidentally, I haven't got an "(N-1)" df term for the total sum of squares - which is what you'd normally expect to see in an analysis of variance table for an ordinary linear regression - because I'm not fitting a constant term here. Usually only the
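The degrees-of-freedom bookkeeping can be checked with a few lines of Python. This is only a sketch: the sample sizes are hypothetical, and the four-parameter case (a separate A and alpha for each data set) is an assumption added for comparison, not something taken from the models above.

```python
def residual_df(n1, n2, n_params):
    """Residual df when no constant term is fitted: total df minus parameters."""
    return (n1 + n2) - n_params


n1, n2 = 10, 12  # hypothetical sample sizes

df_simple = residual_df(n1, n2, 2)    # A, alpha
df_complex = residual_df(n1, n2, 3)   # A, alpha1, alpha2
# Assumed extra case: each data set fitted entirely on its own
df_separate = residual_df(n1, n2, 4)  # A1, alpha1, A2, alpha2

print(df_simple - df_complex)   # difference of 1
print(df_simple - df_separate)  # difference of 2
```

Note that unequal sample sizes do not change these differences; only the number of fitted parameters does.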

If the residual variances of the two data sets are clearly unequal, that'll violate the assumptions that are implicit in the ANOVA calculations, so I think I'd see if there's any normalizing transformation that could be applied to the data before it's analyzed to minimize any heterogeneity present. In the example of the model I described above (which is just an exponential decay curve), if you plotted out such a data set you'd probably find that the data became less variable as t increases, so it would make sense to log the data prior to analysis anyway. But of course that would change the model you'd want to fit, since if you log the original equation you get ln(y) = ln(A) - alpha*t, which is just a simple linear regression in ln(A) and alpha. In that instance, logging the data would make the problem a lot easier to solve, in addition to being the right thing to do if applying such a transformation has the effect of normalizing the residual variation. I don't know exactly what equation you want to fit to your data here, but it could well be that a suitably-chosen transformation could help in your situation also.
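To make the log transformation concrete, here is a small Python sketch that fits ln(y) = ln(A) - alpha*t by ordinary least squares and converts the coefficients back to A and alpha. The function name `fit_log_linear` and the noise-free data are invented for the illustration; it is not a substitute for the JMP fit, just the same arithmetic written out.

```python
import math


def fit_log_linear(t, y):
    """Least-squares fit of ln(y) = ln(A) - alpha*t; returns (A, alpha)."""
    ln_y = [math.log(v) for v in y]  # requires every y > 0
    n = len(t)
    t_mean = sum(t) / n
    ln_mean = sum(ln_y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (li - ln_mean) for ti, li in zip(t, ln_y))
    slope = sxy / sxx
    intercept = ln_mean - slope * t_mean
    # Intercept estimates ln(A); slope estimates -alpha
    return math.exp(intercept), -slope


# Hypothetical noise-free data generated from A = 2, alpha = 0.3
t = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0 * math.exp(-0.3 * ti) for ti in t]

A_hat, alpha_hat = fit_log_linear(t, y)
print(A_hat, alpha_hat)
```

With real data the residuals from this fit are on the log scale, so it is worth plotting them against t to see whether the transformation has actually evened out the variation.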

I'm sorry that's a bit long-winded, but does it help? (BTW, a good place to find an example of nonlinear optimisation being performed is the "Algae Mitscherlich" data set in the JMP online help: you'll find examples of three different models being fitted in which some of the alphas and betas are assumed to be the same across the various fits. There are 120 data points in total, of which 8 are automatically excluded, giving residual degrees of freedom of 112 minus the number of parameters fitted in each case.)