dadawasozo
Level IV

How does DOE analysis handle categorical factors in regression analysis?

Hi,

I ran a DOE with categorical factors, and the prediction formula is a linear regression. How are the categorical factors handled so that linear regression works? Is there any recoding of the categorical factors? Can someone point me to the documentation on how JMP handles this?

2 REPLIES
Victor_G
Super User

Re: How does DOE analysis handle categorical factors in regression analysis?

Hi @dadawasozo,

Categorical variables may be encoded differently depending on which platform you're using.

 

  • For the "Fit Model" platform (Standard Least Squares personality):

"When you enter a column with a nominal modeling type in the Fit Model launch window, JMP represents it internally as a set of continuous indicator variables. Each variable assumes only the values –1, 0, and 1. (Note that this coding is one of many ways to use indicator variables to code nominal variables.) If your nominal column has n levels, then n–1 of these indicator variables are needed to represent it. (The need for n–1 indicator variables relates directly to the fact that the main effect associated with the nominal column has n–1 degrees of freedom.) Full details are covered in Nominal Factors."

From: Statistical Details for Nominal Effects Coding
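To make the quoted coding concrete, here is a minimal Python sketch of effect coding with values -1, 0, and 1. It is an illustration, not JMP's actual implementation: the function names `effect_code` and `effect_estimates` are hypothetical, and it assumes the last level (in sorted order) is the one coded -1 in every column, which is the usual convention for this kind of coding.

```python
def effect_code(values, levels=None):
    """Code a nominal column into n-1 columns taking values -1, 0, 1.

    Assumption: the last level is coded -1 in every column.
    """
    if levels is None:
        levels = sorted(set(values))
    kept, last = levels[:-1], levels[-1]
    rows = []
    for v in values:
        if v == last:
            rows.append([-1] * len(kept))
        else:
            rows.append([1 if v == lev else 0 for lev in kept])
    return kept, rows


def effect_estimates(level_means):
    """For a one-way model with this coding, the intercept is the average
    of the level means and each coefficient is that level's deviation
    from the intercept."""
    levels = sorted(level_means)
    intercept = sum(level_means[lev] for lev in levels) / len(levels)
    coefs = {lev: level_means[lev] - intercept for lev in levels[:-1]}
    return intercept, coefs


kept, rows = effect_code(["A", "B", "C", "A"])
for row in rows:
    print(row)

# Made-up level means, for illustration only
intercept, coefs = effect_estimates({"A": 10.0, "B": 12.0, "C": 17.0})
print(intercept, coefs)
```

A three-level factor produces two columns here, matching the n-1 indicator variables mentioned in the quote. The effect of the last level is not estimated directly; it is minus the sum of the other coefficients.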

 

  • For the "Generalized Regression" personality (Standard Least Squares or other estimation methods):

"The parameterization of nominal variables used in the Generalized Regression personality differs from their parameterization using other Fit Model personalities. The Generalized Regression personality uses indicator function parameterization. In this parameterization, the estimate that corresponds to the indicator for a level of a nominal variable is an estimate of the difference between the mean response at that level and the mean response at the last level. The last level is the level with the highest value order coding; it is the level whose indicator function is not included in the model."

From: Launch the Generalized Regression Personality
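The indicator function parameterization described in this quote can be sketched the same way. Again, this is only a Python illustration under stated assumptions, not JMP's code: the names `indicator_code` and `indicator_estimates` are hypothetical, and "last level" is taken to be the last one in sorted order.

```python
def indicator_code(values, levels=None):
    """Code a nominal column into n-1 columns of 0/1 indicators.

    The last level has no indicator column, so its rows are all zeros.
    """
    if levels is None:
        levels = sorted(set(values))
    kept = levels[:-1]
    rows = [[1 if v == lev else 0 for lev in kept] for v in values]
    return kept, rows


def indicator_estimates(level_means):
    """For a one-way model with this coding, the intercept is the mean
    response at the last level and each coefficient is the difference
    between that level's mean and the last level's mean."""
    levels = sorted(level_means)
    intercept = level_means[levels[-1]]
    coefs = {lev: level_means[lev] - intercept for lev in levels[:-1]}
    return intercept, coefs


kept, rows = indicator_code(["A", "B", "C", "A"])
for row in rows:
    print(row)

# Made-up level means, for illustration only
intercept, coefs = indicator_estimates({"A": 10.0, "B": 12.0, "C": 17.0})
print(intercept, coefs)
```

The two parameterizations give the same predictions, but the individual estimates mean different things: deviations from the average of the level means in one case, differences from the last level in the other. That is why the parameter estimates can differ between the two platforms for the same data.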

 

You might be interested in this discussion, which investigates the differences between the Fit Model and Generalized Regression platforms (differences in estimates, p-values of effects, and so on): https://community.jmp.com/t5/Discussions/Random-effect-test/m-p/659523#M84878

 

I hope this answer helps,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)
dadawasozo
Level IV

Re: How does DOE analysis handle categorical factors in regression analysis?

Hi Victor,

Thanks a lot for the reply. This information is definitely helpful.