Why does the Sum of Squares under "Effect Tests" not add up to the overall model's Sum of Squares?

Bashburz1 · Apr 3, 2025 12:53 PM

Hi all,

I fitted a simple model with one continuous and one nominal variable using Standard Least Squares. Now I am wondering why the sums of squares seem to mismatch under Analysis of Variance and Effect Tests. I am on JMP 18.0.1.

3.16 and 0.01 do not add up to the overall model's SS of 3.37. About 0.2 seems to be missing. Could anyone please explain why that is?

This particularly puzzles me because I am looking at an analysis of the same data in Minitab provided by colleagues, and there the factor "Month" does get the larger SS of 0.209. Everything else, such as the DF and the other SS values, matches. In the end, this affects the p-value used to judge the significance of "month" for the fit, which is why it bothered me in the first place.

Regards,

Konstantin

statman · Apr 3, 2025 01:38 PM

It would really help to look over your results if you attach the actual data table instead of just the outputs. A little bit like "show your work" in school.

One observation I can make right away from your comparison with Minitab is that Minitab is reporting sequential SS (Type 1), not partial SS (Type 3). Sequential SS depend on the order in which terms enter the model, while partial SS do not (and are preferred and usually the default). JMP uses Type 3 by default and also uses REML (I think Minitab is still on the EMS estimation platform).

I'm not sure what your factors are, or which are continuous and which are nominal. Neither batch nor month seems continuous? You have 9 batches and 2 months, but I can't tell where your DF for MSE comes from.
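To make the Type 1 versus Type 3 distinction concrete, here is a minimal sketch in plain Python. The data are made up for illustration (they are not the poster's data), and `sequential_ss` is a hypothetical helper, not a JMP or Minitab function. It computes sequential SS by orthogonalizing each column against everything entered before it. For a main-effects-only model like this one, the partial SS of a term is simply its sequential SS when that term is entered last.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def sequential_ss(terms, y):
    """Sequential (Type 1) SS per term, in the order given.

    terms: list of (name, list_of_columns); an intercept is always fitted first.
    Each column is orthogonalized against all earlier columns (Gram-Schmidt);
    the SS it explains is the squared projection of y onto what is left.
    """
    n = len(y)
    basis = [[n ** -0.5] * n]  # unit-length intercept column
    out = {}
    for name, columns in terms:
        ss = 0.0
        for col in columns:
            r = [float(v) for v in col]
            for q in basis:
                c = dot(q, r)
                r = [ri - c * qi for ri, qi in zip(r, q)]
            norm = dot(r, r) ** 0.5
            if norm > 1e-9:  # skip columns that add nothing new
                q = [ri / norm for ri in r]
                basis.append(q)
                ss += dot(q, y) ** 2
        out[name] = ss
    return out

# Made-up, unbalanced data: batch C was sampled at later time points.
month = [0, 3, 6, 9, 0, 3, 6, 12, 0, 6, 12, 18]
batch = list("AAAABBBBCCCC")
y = [100.0, 99.5, 98.7, 98.3, 100.6, 99.8, 99.4, 98.0, 99.6, 98.6, 97.2, 96.2]

month_term = ("month", [month])
batch_term = ("batch", [[1.0 if b == lev else 0.0 for b in batch] for lev in "BC"])

month_first = sequential_ss([month_term, batch_term], y)
batch_first = sequential_ss([batch_term, month_term], y)

model_ss = sum(month_first.values())
# Partial SS: each term adjusted for the other, i.e. entered last.
partial = {"month": batch_first["month"], "batch": month_first["batch"]}

print("model SS              :", round(model_ss, 3))
print("sequential, month 1st :", {k: round(v, 3) for k, v in month_first.items()})
print("sequential, batch 1st :", {k: round(v, 3) for k, v in batch_first.items()})
print("partial               :", {k: round(v, 3) for k, v in partial.items()})
```

In either order the sequential SS sum to the model SS, while the partial SS do not, because month and batch are correlated in this unbalanced layout. With a balanced, orthogonal design the two kinds of SS would coincide.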

If you look at the parameter estimates table and add the VIF column, what are those values?

"All models are wrong, some are useful" G.E.P. Box

Bashburz1 · Apr 4, 2025 02:55 AM

Thank you very much, it is much clearer already!

I have attached the data to this reply. The VIFs are fairly small.

Your observation is very enlightening to me. Yesterday I was actually wondering whether the order of the terms in the model plays a role, and switched month and batch around, but JMP did not care. That makes perfect sense given your explanation of Type 3. I now also understand that the effects in JMP are adjusted for the presence of the other effects, while the Minitab output is not, which explains the difference. After removing "batch" from the model, JMP gives an SS of 0.2 for "month", which is what Minitab outputs from the model with both factors.

I believe this question is resolved.
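As a quick arithmetic check of that explanation, the numbers quoted in this thread can be added up both ways. This is only a sketch: the assignment of 3.16 to batch and 0.01 to month is an assumption inferred from the posts (the original screenshots are not reproduced here), and all values are rounded as posted.

```python
# Values quoted in this thread, rounded as posted.
ss_model = 3.37              # Model SS under Analysis of Variance
ss_batch_partial = 3.16      # assumed: batch line under Effect Tests
ss_month_partial = 0.01      # assumed: month line under Effect Tests
ss_month_sequential = 0.209  # month SS reported in the Minitab output

# Partial (Type 3) SS need not add up to the model SS.
partial_total = ss_batch_partial + ss_month_partial

# Sequential (Type 1) SS do: month entered first, then batch given month.
# Batch is entered last, so its sequential SS equals its partial SS.
sequential_total = ss_month_sequential + ss_batch_partial

print("partial total   :", round(partial_total, 3))
print("sequential total:", round(sequential_total, 3))
print("model SS        :", ss_model)
```

The sequential total matches the model SS up to rounding, while the partial total falls short by roughly the 0.2 that seemed to be missing.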

However, I have also been wondering similarly as you about the DFs. Month is a continuous factor. So why does it end up with a DF of 1? Maybe you get an idea after looking at the data.

MRB3855 · Apr 4, 2025 06:46 AM

Hi @Bashburz1 : "However, I have also been wondering similarly as you about the DFs. Month is a continuous factor. So why does it end up with a DF of 1?"

That's because, as a continuous factor, it introduces only one parameter to the model: the "slope", i.e., the coefficient that multiplies Month.
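A small sketch of why the modeling type drives the DF: the DF of a term is the number of columns it adds to the design matrix. The helper `design_columns` and the time points below are hypothetical, and the indicator coding is only one possible coding (JMP uses a different coding for nominal terms, but the column count is the same).

```python
def design_columns(values, modeling_type):
    """Columns a term adds to the design matrix (intercept not included)."""
    if modeling_type == "continuous":
        # One column, hence one slope parameter.
        return [[float(v) for v in values]]
    if modeling_type == "nominal":
        # One indicator column per level, minus a reference level.
        levels = sorted(set(values))
        return [[1.0 if v == lev else 0.0 for v in values] for lev in levels[1:]]
    raise ValueError("modeling_type must be 'continuous' or 'nominal'")

# Hypothetical sampling schedule: 10 distinct time points, two batches.
month = [0, 3, 6, 9, 12, 18, 24, 36, 48, 60] * 2

df_continuous = len(design_columns(month, "continuous"))
df_nominal = len(design_columns(month, "nominal"))
print(df_continuous, df_nominal)  # prints: 1 9
```

Treated as continuous, the term costs 1 DF regardless of how many time points there are; treated as nominal, it costs one DF per level beyond the first.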

Bashburz1 · Apr 4, 2025 07:16 AM

Got it, thank you!

statman · Apr 4, 2025 10:48 AM

BTW, if you change the modeling type for month to nominal, you will get 9 DF for month. I'm not sure why you consider month a continuous variable?

Also, if you click the red triangle next to the response in the Fit Model output and select Estimates > Sequential Tests, you will get Type 1 SS.

MRB3855 · Apr 4, 2025 01:01 PM

Hi @statman, my educated guess is that “month” is indeed continuous; it is actually the time since the sample was put on stability, i.e., in a chamber at a given temperature and relative humidity. Its units happen to be months (they could be seconds, minutes, days, weeks, etc.). I’m guessing this is a stability (shelf-life) analysis in the pharmaceutical industry. Am I correct, @Bashburz1? If so, “month” is continuous.

statman · Apr 4, 2025 04:10 PM

Hmmmm, interesting. If you are concerned with stability, I wouldn't model month with a linear model; I would use sampling ideas and control charts for stability assessment. What would it mean if month was significant? Was there an unusual month (a special cause, in Deming's terminology), or is there just a lot of variation month to month?

You certainly can't complete the study and say month is significant and month 1 is better, let's run there. Now if you were to ask the question in a baking process does time have an effect, you would certainly consider that "time" a continuous factor.

But admittedly I am not a "shelf-life" SME.

MRB3855 · Apr 4, 2025 04:35 PM

@statman The appropriate analogy is your baking process; the linear model for “stability” (that’s the term used in the industry for what is commonly referred to as shelf-life…not to be confused with a “stable process” per QC lingo) is an ANCOVA model with time as a covariate.

Probably more than you asked for, but here ya go!
https://database.ich.org/sites/default/files/Q1E_Guideline.pdf#page13
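For readers who want the ANCOVA model with time as a covariate written out, a minimal version is sketched below. The notation is an assumption chosen for illustration and is not copied from the guideline: $y_{ij}$ is the response for batch $i$ at time $t_{ij}$, $\alpha_i$ is the batch effect, and $\beta$ is a slope common to all batches.

```latex
y_{ij} = \mu + \alpha_i + \beta\, t_{ij} + \varepsilon_{ij},
\qquad \varepsilon_{ij} \sim N(0, \sigma^2)
```

In this form, time contributes a single parameter ($\beta$, hence 1 DF) and batch contributes one parameter per batch beyond the first. Allowing each batch its own slope would add a batch-by-time interaction term, $\beta_i\, t_{ij}$ in place of $\beta\, t_{ij}$.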

statman · Apr 5, 2025 12:11 PM

Thanks for the light reading this morning. Wow, that is obviously guidance written by committee.

"The purpose of a stability study is to establish, based on testing a minimum of three batches of the drug substance or product, a retest period or shelf life and label storage instructions applicable to all future batches manufactured and packaged under similar circumstances".

Where did 3 come from? 3 batches is representative of all batches...in what world?

As you note, the use of the word stability is confusing to me and not defined in the document. It appears to be directed towards the deterioration of the drug efficacy?

I like this section:

"Data not amenable to statistical analysis"

In any case, to the OP, if you are following this guidance, month should be the covariate and you should use sequential tests on the covariate, then partial SS (type 3) for the remaining factors in your study.
