JenniferB
Level I

Correct Degrees of Freedom?

Hello,

I am using the Fit Model function to model a continuous variable with 6 different categorical effects. However, the DF listed for the model is 8 instead of 6, and the parameter estimates report splits two of these categorical variables into two terms each. Why are these two variables being split into two terms? Are the degrees of freedom correct? I have included photos of the Fit Model screen and the parameter estimates screen. Thank you!

Fit Model Screen.JPG

Parameter Esimtates and DF.JPG

Accepted Solutions
Victor_G
Super User

Re: Correct Degrees of Freedom?

Hi @JenniferB,


Welcome to the Community!
It is a little harder to answer without full access to the data, but I will do my best with the information provided in the screenshots. Please correct me if any of my assumptions about your dataset are wrong.
Let's see what's behind the degrees of freedom (DFs) calculated for your model:

 

  • DFs in your dataset: Looking at the "Analysis of Variance" panel, you seem to have 50 different observations in your dataset, hence 50 degrees of freedom in total. If you have duplicate observations, meaning observations with the same X values, these don't add any degrees of freedom (but they may help in estimating the response variance).
  • DFs in your model: The number of DFs "consumed" by your model depends on the type of factors you have. Each continuous numerical factor uses 1 DF to estimate its parameter in the equation. For a categorical factor, the number of DFs used depends on the number of levels: a factor with N levels uses N-1 DFs, since the estimate for the last level can be calculated from the others. For a 3-level factor, the level estimates sum to zero: L1+L2+L3 = 0.
    For example, in the equation for hot dog price as a function of meat type, you can check that the three level estimates do sum to 0:
    Victor_G_0-1688628113669.png
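
    To make the sum-to-zero idea concrete, here is a minimal Python sketch (not JMP output). It assumes effect (sum-to-zero) coding for a 3-level factor and uses made-up prices, not the data from the screenshot; the level names and values are hypothetical.

```python
import numpy as np

# Hypothetical prices for three meat types (NOT the data from the screenshot).
level_names = ["beef", "meat", "poultry"]
group = np.array([0, 0, 1, 1, 2, 2])            # level index of each observation
y = np.array([1.60, 1.50, 1.20, 1.10, 0.80, 0.70])

# Effect (sum-to-zero) coding: 3 levels need only 2 columns.
# The last level is coded -1 in both columns.
codes = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [-1.0, -1.0]])
X = np.column_stack([np.ones(len(y)), codes[group]])

# Ordinary least squares: intercept plus N-1 = 2 level estimates.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, e1, e2 = beta

# The third level estimate is not fitted; it follows from the other two.
e3 = -(e1 + e2)

for name, est in zip(level_names, (e1, e2, e3)):
    print(f"{name}: {est:+.3f}")
print("sum of level estimates:", round(e1 + e2 + e3, 12))
```

    Only two estimates are fitted, which is why a 3-level factor costs 2 DFs; the third comes for free from the constraint.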

     

Coming back to your question, you seem to have the following factors in your model:

  • Q28: Gender: 2-level categorical factor (1 DF),
  • Q31: Hometown: 3-level categorical factor (2 DFs),
  • Q32: Highest Degree: 2-level categorical factor (1 DF),
  • Q39: EuthExp: 3-level categorical factor (2 DFs),
  • Q43: PriorTraining: 2-level categorical factor (1 DF),
  • Q44: Euth6mos: 2-level categorical factor (1 DF).

 

So the calculations in JMP are correct. Your model uses 1+2+1+2+1+1 = 8 DFs in total. With 50 DFs available in your dataset, 42 DFs are left for error estimation, as shown in the "Analysis of Variance" panel. The two 3-level factors (Hometown and EuthExp) are the ones that appear as two terms each in the parameter estimates.
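
The same bookkeeping can be written as a short Python sketch. The level counts are the ones assumed above from the screenshots, and the total DF of 50 is the value read from the "Analysis of Variance" panel; the variable names are only illustrative.

```python
# Number of levels per factor, as assumed from the screenshots in this thread.
factor_levels = {
    "Gender": 2,
    "Hometown": 3,
    "Highest Degree": 2,
    "EuthExp": 3,
    "PriorTraining": 2,
    "Euth6mos": 2,
}

# Each categorical factor with N levels uses N-1 DFs.
model_df = sum(n - 1 for n in factor_levels.values())

total_df = 50                   # total DF read from the Analysis of Variance panel
error_df = total_df - model_df  # what is left for error estimation

print(model_df, error_df)  # 8 42
```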

I hope this answer will clarify the outputs you're seeing,

 

Victor GUILLER
Scientific Expertise Engineer
L'Oréal - Data & Analytics

JenniferB
Level I

Re: Correct Degrees of Freedom?

Yes, that makes sense! Thank you for your help!

Jen

statman
Super User

Re: Correct Degrees of Freedom?

Just one possible clarification: if you have 50 total DFs, you must have 51 total observations.

"All models are wrong, some are useful" G.E.P. Box
Victor_G
Super User

Re: Correct Degrees of Freedom?

Yes, I forgot to count 1 DF for the intercept, good catch!
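
For anyone who wants to see where the extra 1 comes from, here is a rough Python sketch that builds a design matrix for 51 observations with the factor sizes discussed in this thread. The level assignments are randomly generated placeholders, not the real survey data, and the count assumes no column is a linear combination of the others.

```python
import numpy as np

rng = np.random.default_rng(0)

n_obs = 51                          # 50 total DFs + 1 for the intercept
level_counts = [2, 3, 2, 3, 2, 2]   # levels per factor, as assumed in this thread

# Intercept column, then N-1 indicator columns per N-level factor.
columns = [np.ones(n_obs)]
for k in level_counts:
    # Placeholder level assignment: every level appears, order shuffled.
    levels = rng.permutation(np.arange(n_obs) % k)
    for j in range(k - 1):
        columns.append((levels == j).astype(float))
X = np.column_stack(columns)

n_params = X.shape[1]        # intercept + 8 factor terms
total_df = n_obs - 1         # the intercept takes 1 DF out of n_obs
model_df = n_params - 1      # DFs used by the factors
error_df = n_obs - n_params  # DFs left for error

print(X.shape)                       # (51, 9)
print(total_df, model_df, error_df)  # 50 8 42
```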

Victor GUILLER
Scientific Expertise Engineer
L'Oréal - Data & Analytics