aatw
Level III

DoE Color map on correlations for multi-level categorical factors

Hi,

 

I am trying to understand the Color map on Correlations in DoE Evaluation when the design includes a categorical factor (for example, "Solution") with several levels (e.g. "Distilled water", "Tap water", "NaCl").

The Color map on Correlations shows Solution 1 and Solution 2 in the top label section, but what do these mean? Is Solution 1 the correlation between Distilled water and Tap water (as they appear in rows), or between Distilled water and NaCl (alphabetically)? And what is Solution 2 then? Also, when I change the Value Order column property, the Color map on Correlations changes, both in its pattern and in the absolute values of the correlations. I am a bit confused about how to interpret the values.

 

Can somebody explain this with an example?

 

Many thanks

3 REPLIES
Victor_G
Super User

Re: DoE Color map on correlations for multi-level categorical factors

Hi @aatw,

 

The ordering of the levels of your categorical factor "Solution" is the order in which you entered the levels (1, 2, 3) when defining the factors in your DoE. You can check this by creating the design and looking at the column property "Value Order":

[Screenshot: Victor_G_0-1748002698181.png]

So if you defined your solution levels in the order "Distilled water", "Tap water", "NaCl", then these levels correspond to Solution 1, Solution 2, Solution 3.

You can hover over each intersection of terms to see which pair of terms a correlation value belongs to. In the example below, the second row/first column cell (or, equivalently, the first row/second column cell) shows a correlation coefficient of 0 for the terms Solution 1 x Solution 2, corresponding to Distilled water x Tap water:

[Screenshot: Victor_G_1-1748003137411.png]

If you change the Value Order property, the correlation matrix will of course change, because Solution 1 will no longer be linked to Distilled water but to whichever level you have put first.

 

Hope this will help you,

 

 

 

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)
aatw
Level III

Re: DoE Color map on correlations for multi-level categorical factors

Many thanks Victor.

Why is the correlation between Tap water and NaCl (Solution 2 x Solution 3) missing? Regards
Victor_G
Super User

Re: DoE Color map on correlations for multi-level categorical factors

I understand your confusion.

 

What's going on behind the scenes of this correlation map for your categorical factor is very similar to the parameter estimates shown by default in a least squares model: you only see values for n-1 levels of an n-level categorical factor. This is due to nominal effect coding: if your nominal column has n levels, then n-1 indicator variables are needed to represent it. (The need for n-1 indicator variables relates directly to the fact that the main effect associated with the nominal column has n-1 degrees of freedom.) In simple terms, the positions of the last level of the nominal factor in your experimental design are determined by the positions of the other n-1 levels. More information here: Nominal Factors
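To make the n-1 idea concrete, here is a minimal Python sketch of effect coding for the three Solution levels. It is only an illustration: the function name `effect_code` and the six example runs are made up, and the plain +1/0/-1 coding shown here is an assumption that ignores any rescaling JMP applies in design evaluation.

```python
import numpy as np

def effect_code(values, level_order):
    """Effect-code a nominal column: n levels -> n-1 columns.

    Level i (for i < n-1) gets a 1 in column i; the last level in
    level_order gets -1 in every column. Hypothetical helper, not a JMP API.
    """
    n_levels = len(level_order)
    index = {level: i for i, level in enumerate(level_order)}
    coded = np.zeros((len(values), n_levels - 1))
    for row, value in enumerate(values):
        i = index[value]
        if i < n_levels - 1:
            coded[row, i] = 1.0
        else:
            coded[row, :] = -1.0  # last level is implied by the others
    return coded

# Made-up example runs, using the level order from the question.
order = ["Distilled water", "Tap water", "NaCl"]
runs = ["Distilled water", "Tap water", "NaCl",
        "Tap water", "NaCl", "Distilled water"]

X = effect_code(runs, order)
print(X)  # two columns, playing the role of "Solution 1" and "Solution 2"
```

There is no third column for NaCl: its rows are fully described by -1 in both columns, which is why no "Solution 3" term appears.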

 

Some technical details about the transformation of this 3-level categorical factor into 2 continuous, independent, orthonormal factors (vectors) can be found here:

 Gram–Schmidt process - Wikipedia

9.5: The Gram-Schmidt Orthogonalization procedure - Mathematics LibreTexts

The Gram-Schmidt orthogonalization procedure is needed to make sure the 2 independent vectors related to the Solution factor are orthonormal (orthogonal to each other and of unit length). This scaling ensures that all factors in your design have the same length, so effects are compared on the same basis during model building: there is no bias from high-cardinality categorical factors, which would otherwise produce very long vectors and make these factors appear more (or less) important than they are.
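As a rough illustration of that step, the sketch below applies modified Gram-Schmidt to the two effect-coded contrast columns of a 3-level factor. The starting contrasts and the unit-length normalization are assumptions chosen to match the description above; the exact scaling JMP uses internally may differ.

```python
import numpy as np

def gram_schmidt(columns):
    """Orthonormalize the columns of a matrix with modified Gram-Schmidt."""
    q = np.array(columns, dtype=float)
    for j in range(q.shape[1]):
        # Remove the components along the already-orthonormal columns.
        for k in range(j):
            q[:, j] -= (q[:, k] @ q[:, j]) * q[:, k]
        norm = np.linalg.norm(q[:, j])
        if norm < 1e-12:
            raise ValueError("columns are linearly dependent")
        q[:, j] /= norm  # scale to unit length
    return q

# Rows are the three levels, columns are the two effect-coded contrasts
# (assumed starting point: last level coded as -1 in both columns).
contrasts = np.array([[1, 0],
                      [0, 1],
                      [-1, -1]])

Q = gram_schmidt(contrasts)
print(Q)
print(Q.T @ Q)  # close to the 2 x 2 identity matrix
```

After this step the two columns are orthogonal to each other and each has length 1, while each column still sums to zero across the three levels.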

 

When building your design, you can check the option "Save X Matrix" to see how your 3-level categorical factor is transformed into a set of 2 orthonormal, independent factors (Solution1 and Solution2). I have modified the "Model Matrix" script initially saved in the attached table so that it produces a data table with the same factor names as those in the correlation map shown when evaluating your design.
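The effect of the level order on the correlations can also be reproduced outside JMP. The sketch below is a hedged illustration, not the saved Model Matrix script: it uses plain +1/0/-1 effect coding (an assumption, without JMP's rescaling), a made-up unbalanced set of six runs, and a hypothetical helper `coded_columns`.

```python
import numpy as np

def coded_columns(values, level_order):
    """Effect-code a nominal column into n-1 columns (hypothetical helper)."""
    n_levels = len(level_order)
    index = {level: i for i, level in enumerate(level_order)}
    coded = np.zeros((len(values), n_levels - 1))
    for row, value in enumerate(values):
        i = index[value]
        if i < n_levels - 1:
            coded[row, i] = 1.0
        else:
            coded[row, :] = -1.0
    return coded

# Made-up, deliberately unbalanced runs so the columns are correlated.
runs = ["Distilled water"] * 3 + ["Tap water"] * 2 + ["NaCl"]

order_a = ["Distilled water", "Tap water", "NaCl"]
order_b = ["NaCl", "Tap water", "Distilled water"]

X_a = coded_columns(runs, order_a)
X_b = coded_columns(runs, order_b)

# Absolute correlation between the "Solution 1" and "Solution 2" columns.
r_a = abs(np.corrcoef(X_a, rowvar=False)[0, 1])
r_b = abs(np.corrcoef(X_b, rowvar=False)[0, 1])
print(r_a, r_b)
```

The runs are identical in both cases; only the level order differs. Because the coded columns are built from that order, the correlation between the two Solution terms changes as well, which is consistent with the map changing when Value Order is edited.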

 

Hope this will help you,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)
