JMP User Community
Color Map on Correlations creation

Sep 11, 2013 1:52 PM

Oct 2, 2013 6:41 AM

Solution

I finally was able to find a good explanation for the creation of the color map on correlations. I actually found it in the JMP documentation on fitting linear models. Fitting Linear Models.pdf > Chapter 3 Standard Least Squares Report and Options > Correlation of Estimates, page 189 in JMP 11 documentation.

Using these equations, with variance = 1, the color map on correlations plot is just the absolute value of Corr(Beta-hat).

Dec 11, 2017 12:47 PM

The correlations of estimates matrix is NOT the absolute value of the corr(beta-hat) matrix. Here's an example to demonstrate. Note I have added a column of 1's for the intercept term.

X = [1 0.53 -1 -1 -0.53 -0.53 1,
1 1 1 1 1 1 1,
1 -1 1 -1 -1 1 -1,
1 -1 -1 1 1 -1 -1,
1 1 1 -1 1 -1 -1,
1 -1 1 1 -1 -1 1,
1 1 -1 1 -1 1 -1];

The "color map on correlations" does not include the column of intercepts, but this is immaterial since correlations are pairwise. Anyhow, if you put the above design into the DOE platform (less the column of 1's), or do the same and use the multivariate --> correlations option, you get the following correlation matrix among the columns of the X matrix:

[1 -0.0926 -0.0926 0.2819 0.2819 0.0926,
-0.0926 1 -0.1667 0.0926 0.0926 0.1667,
-0.0926 -0.1667 1 0.0926 0.0926 0.1667,
0.2819 0.0926 0.0926 1 -0.2819 -0.0926,
0.2819 0.0926 0.0926 -0.2819 1 -0.0926,
0.0926 0.1667 0.1667 -0.0926 -0.0926 1]
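As a cross-check outside JMP, these column correlations can be reproduced with a short script. This is only a sketch in plain Python (not JSL), and the names `design`, `pearson`, and `column_corr` are made up for illustration. The values in `design` are the X matrix from this post, less the column of 1's.

```python
from math import sqrt

# Design from this post without the intercept column
# (rows = runs, columns = factors).
design = [
    [0.53, -1, -1, -0.53, -0.53, 1],
    [1, 1, 1, 1, 1, 1],
    [-1, 1, -1, -1, 1, -1],
    [-1, -1, 1, 1, -1, -1],
    [1, 1, -1, 1, -1, -1],
    [-1, 1, 1, -1, -1, 1],
    [1, -1, 1, -1, 1, -1],
]


def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / sqrt(s_uu * s_vv)


# Transpose so that each entry of `columns` is one factor column.
columns = list(zip(*design))
column_corr = [[pearson(a, b) for b in columns] for a in columns]

for row in column_corr:
    print(" ".join(f"{value:7.4f}" for value in row))
```

The first row prints with -0.0926 and 0.2819 in the positions shown in the matrix above, so the pairwise correlations really do come from the factor columns alone.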

If you look into the JMP documentation regarding the correlations of estimates, you'll find that it's defined as

corr(beta-hat) := V_inv*(X’X)_inv*V_inv where V := sqrt(diag((X’X)_inv))

Using the X matrix as above, this gives

X_t = Transpose(X);
V = sqrt(diag(Inverse(X_t*X)));
corr_beta_hat = Round(Inv(V)*Inverse(X_t*X)*Inv(V),4);

[1 -0.3101 -0.3154 -0.3154 0.3101 0.3101 0.3154,
-0.3101 1 0.3101 0.3101 -0.5 -0.5 -0.3101,
-0.3154 0.3101 1 0.3154 -0.3101 -0.3101 -0.3154,
-0.3154 0.3101 0.3154 1 -0.3101 -0.3101 -0.3154,
0.3101 -0.5 -0.3101 -0.3101 1 0.5 0.3101,
0.3101 -0.5 -0.3101 -0.3101 0.5 1 0.3101,
0.3154 -0.3101 -0.3154 -0.3154 0.3101 0.3101 1]

This is exactly what you get when you use the "fit model" command in JMP. Note that the values in this matrix do not depend on either the response or the MSE.
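For anyone who wants to confirm outside JMP that only X enters this calculation, here is a rough sketch in plain Python rather than JSL. The helper names (`transpose`, `matmul`, `inverse`) are illustrative and are not JMP functions. The script scales (X'X)^-1 by the square roots of its diagonal, following the definition quoted from the documentation, and no response column appears anywhere.

```python
from math import sqrt

# X matrix from this post, including the leading column of 1's.
X = [
    [1, 0.53, -1, -1, -0.53, -0.53, 1],
    [1, 1, 1, 1, 1, 1, 1],
    [1, -1, 1, -1, -1, 1, -1],
    [1, -1, -1, 1, 1, -1, -1],
    [1, 1, 1, -1, 1, -1, -1],
    [1, -1, 1, 1, -1, -1, 1],
    [1, 1, -1, 1, -1, 1, -1],
]


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def inverse(m):
    """Gauss-Jordan inverse with partial pivoting (small dense matrices only)."""
    n = len(m)
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(m)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


xtx_inv = inverse(matmul(transpose(X), X))

# Dividing entry (i, j) by sqrt(diag_i * diag_j) is the same as
# V_inv * (X'X)_inv * V_inv with V = sqrt(diag((X'X)_inv)).
sd = [sqrt(xtx_inv[i][i]) for i in range(len(xtx_inv))]
corr_beta_hat = [
    [xtx_inv[i][j] / (sd[i] * sd[j]) for j in range(len(sd))]
    for i in range(len(sd))
]

for row in corr_beta_hat:
    print(" ".join(f"{value:7.4f}" for value in row))
```

The printed matrix should agree with the one above to four decimals. For example, the entry for the first and fourth factors comes out as -0.5.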

So my question is this: if one were trying to decide if the amount of confounding among model terms (or estimates) was acceptable, should one review the correlations among the columns of the design matrix or should one review the correlations among the beta-hats? Why might one be better than the other? Is there an intuitive explanation describing the relationship between the two?

Dec 12, 2017 4:13 AM

The correlation among the parameter estimates is the correlation that really matters. This quantity determines the inflation of the standard errors of these estimates. It reduces the power of your tests. It widens your confidence intervals. The correlation between the factor columns in the design is merely a 'means to an end.' Eliminating this correlation will eliminate the correlation of the estimates of the main effects, for example.

Note that there is no absolute cutoff for unacceptable correlation because the tolerable VIF does depend on the effect size and the RMSE. If you have a very small RMSE compared to the effect, then you can tolerate high VIF. On the other hand, if you have a small effect compared to the RMSE, then even a small correlation might adversely affect your power.
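To make the link between column correlation, VIF, and correlation of estimates concrete, here is a minimal sketch for the simplest case: a model with an intercept and just two factors whose columns have correlation r. It is plain Python, and the function name `two_factor_summary` is hypothetical. In this case the slope covariance is proportional to the inverse of the 2x2 correlation matrix, so the VIF is 1/(1 - r^2) and the two slope estimates have correlation -r.

```python
from math import sqrt


def two_factor_summary(r):
    """Return (VIF, correlation of the two slope estimates) for column correlation r."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must be strictly between -1 and 1")
    det = 1.0 - r * r
    # Inverse of the 2x2 correlation matrix [[1, r], [r, 1]].
    inv = [[1.0 / det, -r / det], [-r / det, 1.0 / det]]
    vif = inv[0][0]  # diagonal of the inverse is the VIF
    corr_estimates = inv[0][1] / sqrt(inv[0][0] * inv[1][1])
    return vif, corr_estimates


for r in (0.0, 0.2819, 0.5, 0.9):
    vif, corr_estimates = two_factor_summary(r)
    print(f"r={r:6.4f}  VIF={vif:7.4f}  SE inflation={sqrt(vif):6.4f}  "
          f"corr(estimates)={corr_estimates:7.4f}")
```

With more than two factors, each entry of the inverse mixes all of the pairwise column correlations, so the correlation of estimates is no longer just the sign-flipped column correlation. That is why the two matrices in the earlier example differ by more than sign.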

Dec 12, 2017 9:27 AM

And I agree that there is no correct cutoff for correlation values. The entirety of inferential statistics involves tradeoffs, and which tradeoff to make is dependent on the application, the data, and one's personal viewpoint.