Discussions

TomR · Jun 10, 2023 4:32 PM

Greetings,

I'm having trouble understanding JMP's coding of a simple 2-level factorial design. The (-1,1) levels of the coding table created by Fit Model (Standard Least Squares, Effect Screening) appear reversed from the design table Pattern and associated column Value Orders.

I've attached an example design with factor levels named to keep track of low and high levels (Full Factorial-Nom.jmp). Three factors are character/nominal and one is numeric/nominal, just to see if it is treated differently. The design pattern column correctly identifies the levels, and the Value Order property for each column is also correct. So far so good...

After fitting a main effects model, the levels of the coding table are reversed. I've attached the result with the original pattern column pasted in for reference (Coding Table-Nom.jmp). I know that the model will be mathematically correct with signs reversed but I find it to be a nuisance in practice, especially on larger and more complex fractional designs where I often use coded levels for diagnostics, exporting to other software, etc.

I note the coding attribute in the data table is inaccessible for nominal column types. Why is that? I can reverse the column value orders to code correctly but then, of course, plotting levels are reversed. Am I missing something obvious here? I've looked at documentation and other threads before posting.

Thanks for any insights -

- Tom

statman · Jun 22, 2021 03:21 PM

Sorry, I don't think I understand your question. Desired coding of the factors is equi-distant centered on zero. This historically accounts for the likelihood that parameter estimates will need to "compensate" for differences in level values thus making it difficult to compare the estimates. Coding "normalizes" the parameter estimates. Now, it is done in the background whether you code or not. Coding of categorical factors (Joe and Mike for example) is arbitrary.

What I don't understand is when you analyze the dataset using fit model, you don't change the dataset whatsoever. You don't get a new table when you do the analysis. So I don't understand how or why you created the second dataset?

"All models are wrong, some are useful" G.E.P. Box

TomR · Jun 22, 2021 03:58 PM

Thanks. Actually, I'm not questioning the need or reasons for coding, or the different parameterizations that are used for various contrasts. I'm after something simpler since I often use the model coding as well as the named factor table. For the two-level case using the sum-to-zero contrasts, why doesn't JMP use the intuitive (-1) for the "lower" factor levels and (+1) for the "higher" levels that are defined in the design data table? And is there a way to force that?

Thanks,

- Tom

statman · Jun 22, 2021 1:08 PM

Sorry, I'll have to let someone else answer as I am still confused by the question. Which is higher Joe or Mike? Which is higher green or red? These are arbitrary, there is no logical ordering to the coding. Are you having difficulty interpreting the parameter estimates? You can specify what level is which in the design of the experiment when you add the factors.

"All models are wrong, some are useful" G.E.P. Box

TomR · Jun 22, 2021 2:08 PM

@statman wrote:
Sorry, I'll have to let someone else answer as I am still confused by the question. Which is higher Joe or Mike? Which is higher green or red? These are arbitrary, there is no logical ordering to the coding. Are you having difficulty interpreting the parameter estimates? You can specify what level is which in the design of the experiment when you add the factors.

Just to clarify for anyone who'd like to look at this. As statman notes, I did specify the factor levels when adding the factors and creating the design. The "Pattern" column JMP created confirms that these levels were established. I also used "Low" and "High" in the factor level names to make them "non-arbitrary" for the purpose of interpreting this example. So the design is fine for my purpose. See attached design table in opening post. My question is not about the design, but rather about why the modeling platform codes it in a way that is inconsistent with the design levels as established. That is, why are the coding matrix (-1,1) signs reversed from the design?

- Tom

Dan_Obermiller · Jun 22, 2021 6:00 PM

I believe I understand what you are saying. Whenever there is a nominal factor, one of the levels needs to be considered the "base" level. By convention, it is always the last level listed (the highest level in this case). This would lead to the parameter estimates being shown for the low level. Your Full Factorial-Nom table is not exhibiting that behavior because one of the column properties in the table is Value Ordering. You have changed that to be reversed for all of your factors. So your "base" level is now the low level, not the high level. This is NOT the default ordering that JMP provides.

If you re-run your DOE Dialog script and make a new table, look at the Value Ordering. You will see your low level listed first, high level is second. Now fill in your response data and re-analyze, I think that you will see the coding as you wish it to be.

Dan Obermiller

TomR · Jun 22, 2021 10:38 PM

Thanks, Dan - Your response is getting to the heart of the question, describing the coding convention for nominal factors. I'm understanding what you wrote to mean that the last level listed will be coded to be the "base level" = +1.

I'm sorry I mistakenly uploaded the design table after I'd reversed the Value Order for X1 to see what would happen. Just to start fresh, here's a new design table attached with Value Order for all factors set as you described: low level listed first, high level is second. See below for the default Value Order dialog for X1 after re-running the DOE creation. The other factor levels are also listed in the same way. When I re-analyze I still do get coding reversed from the original design pattern. Updated coding table also attached.

Please let me know if I'm misinterpreting something here-

Dan_Obermiller · Jun 23, 2021 2:04 PM

With the revised files, you are using JMP's default coding. The low level will be displayed in the parameter estimates. The high level parameter estimate will be the opposite value of the low level parameter estimate since you only have two levels. One way to see this is to choose the red popup menu and select Estimates > Show Prediction Expression. I only copied the main effects.

Remember that when you use a Nominal modeling type, they are truly just names. It really is irrelevant on which value is the "low" setting and which one is the "high" setting.
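To make the convention discussed above concrete, here is a minimal Python sketch of sum-to-zero coding for a two-level nominal factor. It is an illustration, not JMP's implementation: it assumes the first level in the value order is coded +1 and the last ("base") level is coded -1, and the level names "Low" and "High" and the function name `effect_code` are hypothetical.

```python
def effect_code(values, value_order):
    """Sum-to-zero coding for a two-level nominal factor.

    Assumption (not taken from JMP source): the first level in
    value_order is coded +1 and the last ("base") level is coded -1.
    """
    if len(value_order) != 2:
        raise ValueError("this sketch handles exactly two levels")
    first, last = value_order
    mapping = {first: 1, last: -1}
    return [mapping[v] for v in values]


# Hypothetical factor column with levels named "Low" and "High".
x1 = ["Low", "High", "Low", "High"]

# Value order with the low level listed first, high level second.
default_coding = effect_code(x1, ["Low", "High"])

# Reversing the value order flips every sign in the coded column.
reversed_coding = effect_code(x1, ["High", "Low"])

print(default_coding)   # [1, -1, 1, -1]
print(reversed_coding)  # [-1, 1, -1, 1]
```

Under this assumption, a design whose "low" level is listed first ends up with +1 on the low rows, which would look reversed next to a design pattern that marks low as "-". The fitted model is the same either way; only the sign of the reported coefficient changes.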

If your factors are numeric and continuous, the coding formula is (X - Midrange(X)) / (Range(X)/2), which is NOT the same as how coding is done for nominal factors. Since you are using nominal, that formula is not used and cannot be used. The details on what JMP does can be found in the Fitting Linear Models > Standard Least Squares part of the JMP Documentation. I have included part of that explanation in the attachment.
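For comparison, the midrange/half-range coding for continuous factors can be sketched in a few lines of Python. This only illustrates the formula as written above; the function name `code_continuous` and the example settings are hypothetical.

```python
def code_continuous(x):
    """Scale a numeric factor so its minimum maps to -1 and its maximum to +1.

    Implements (X - midrange) / (range / 2).
    """
    lo, hi = min(x), max(x)
    if lo == hi:
        raise ValueError("factor has no range; cannot code a constant column")
    midrange = (lo + hi) / 2
    half_range = (hi - lo) / 2
    return [(v - midrange) / half_range for v in x]


# Hypothetical settings: two levels plus a center point.
temps = [150, 200, 150, 200, 175]
coded = code_continuous(temps)
print(coded)  # [-1.0, 1.0, -1.0, 1.0, 0.0]
```

Here the sign follows the numeric size of the setting, so the smaller value always maps to -1. A nominal factor has no such numeric ordering to rely on, which is why its coding depends on the value order instead.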

Dan Obermiller

TomR · Jun 24, 2021 10:04 AM

What you've posted makes sense, Dan. Helpful to see the Show Prediction Expression tool as well. I'll let it rest there. Thanks for your patience!

- Tom

Controlling nominal 2-level factor coding