Jimvano7
Level III

Centering IVs in regression only in interaction

I have read in several places that, in regression, JMP will mean-center IVs that are involved in interactions but will NOT center the simple-effect versions of those IVs. First, is this true? Second, if true, how does this not violate the linear independence requirement of regression?

For example,
Y = b0 + b1x1 + b2x2 + b3x1x2

If x1x2 becomes x1'x2' because JMP centers both variables only in the interaction, then b1 is no longer the estimate of the effect of x1 on Y when x2 = 0, because the interaction term is no longer 0 there. The same is true of b2.
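
To make the concern concrete, here is the algebra I have in mind (a sketch, writing m1 and m2 for the means of x1 and x2):

```latex
% Expanding the centered product term:
b_3 (x_1 - m_1)(x_2 - m_2)
    = b_3 x_1 x_2 - b_3 m_2 x_1 - b_3 m_1 x_2 + b_3 m_1 m_2
```

The centered product leaks extra slope terms (-b3 m2 x1 and -b3 m1 x2) and a constant (b3 m1 m2) into the main effects and the intercept, so setting x2 = 0 no longer zeroes out the interaction's contribution.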

Given these problems, I assume I am misunderstanding what JMP is really doing. Can anyone clarify?

2 ACCEPTED SOLUTIONS

julian
Community Manager

Re: Centering IVs in regression only in interaction

Hi @Jimvano7, and everyone else,

I am wading into this answer a bit late so forgive me for not responding to all the pieces (or for missing some nuance), but I wanted to offer up an answer (and video) I gave on this same topic on the community about 10 years ago. I did a little demonstration in the video with the prediction profiler that I think helped make some of the estimates clear. 

https://community.jmp.com/t5/Discussions/estimates-in-multipule-regression/m-p/10965/highlight/true#...

 

And here is a direct link to that video:

https://www.youtube.com/watch?v=LLh1V9MtKvs

 

I hope this helps!

@julian 

View solution in original post

julian
Community Manager

Re: Centering IVs in regression only in interaction

The intercept is always some variation of where the line (or the plane of regression, in models with more than a single term) sits on the Y axis when the other terms contribute nothing (i.e., are set to 0). When we center just the interaction term, "nothing" of the main effects takes on a changed meaning. Not numerically (we're still talking about 0 of each predictor), but what that zero points at in the population changes: mean-centering the interaction places the zero interaction effect at the average behavior in the population, not at the literal origin (0,0) of the predictors.

 

In short, in a model like this, the intercept is more like an estimate in an analysis of covariance: an adjusted estimate based on statistically removing the average effect of the interaction from the plane. I don't find that explanation particularly helpful conceptually, so if you'll allow it, I'm going to talk it through with the example I used before.

 

I always find it helpful to see these things visually. Here's that example I used before; let's look at the regression planes (which will be the same) for the centered polynomial model (left) and the uncentered model (right). I've added a response grid at 50 for both (which is the intercept of the centered model). I have also put blue dots to show where the intercepts of the models are:

[Image: the two regression planes, with a response grid at Y = 50 and blue dots marking each model's intercept]

 

Starting on the right, the intercept has a very easy interpretation. It's the value of Y where the plane of the response crosses 0 for both X1 and X2; that is, when there are 0 study hours and 0 previous knowledge. Easy. (Important for later: we aren't even thinking about the interaction term here, because in this kind of model, when X1 = 0 and X2 = 0, the interaction adds nothing; the b3 coefficient is being multiplied by zeros.)

 

For the centered model on the left, the model intercept of 50 is well above the value when there is 0 of both Xs. But why the bump of roughly 20 exam points?

 

A score of 50 is where we have roughly 40 of Previous Knowledge and 0 Study Hours; or, where we have 0 Previous Knowledge and 4 Study Hours. Here I've toggled on the value grids so you can see them line up with the blue dots I put before:

julian_2-1750158949377.png

 

So, what gives?! We know these are not the means of Previous Knowledge and Study Hours, so it's not as simple as holding one variable at 0 and the other at its mean. One thing might pop out to you here: these points are a symmetric distance up the plane of response from the "true" (X1 = 0, X2 = 0) intercept. And the only term in our model that exerts a symmetric influence (in a scaled sense) on Y across X1 and X2 is b3, the interaction term.

 

What we're not accounting for yet is setting the *interaction* term, b3, to 0. And in a model like this, that zero happens at a different place than where X1 and X2 are 0 (because of the centering); it happens at the means of X1 and X2, so we're talking about the *average* interaction. The intercept of 50 here reflects a kind of adjusted baseline: it's what we would get at (X1 = 0 or X2 = 0) if there were no interaction effect in the population. Conceptually, it's an estimate of the intercept adjusted for the presence of the interaction.
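
In symbols (a quick sketch, writing m1 and m2 for the factor means, c0 for the centered model's intercept, and b0 for the uncentered one):

```latex
Y = c_0 + c_1 X_1 + c_2 X_2 + b_3 (X_1 - m_1)(X_2 - m_2)
% Expanding the product and matching constant terms against the uncentered fit:
c_0 = b_0 - b_3 m_1 m_2
```

With a negative b3 and positive means for both factors, the correction -b3 m1 m2 is positive: exactly that bump from an intercept of 30 up to 50.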

 

To me, this term resists a conceptual interpretation quite a bit more than any typical intercept, but here's how I would frame it in this case: with the negative coefficient for the interaction term, we know that these factors are interacting antagonistically (more of one decreases the strength of the relationship between the response Y and the other factor). That is, the more people know ahead of time, the less value they get from studying, on average. Or, the more people study, the less value they get, on average, from what they already knew. The intercept in this model is trying to tell us what exam scores would be like *if that were not the case.* If that interaction weren't the state of the world we measured, then people who studied 0 hours would have gotten more value from their previous knowledge, and so they would do better on the exam: a bump up from an intercept of 30 to 50. And if that interaction weren't the state of the world we measured, then people who had 0 previous knowledge would have gotten more value from their studying, hence that same bump up of the intercept from 30 to 50. Like an ANCOVA, this is a statistical "as if" thought experiment.

 

I hope this helps!

 

Jules

 

View solution in original post

19 REPLIES
Victor_G
Super User

Re: Centering IVs in regression

Hi @Jimvano7,

 

Welcome to the Community!

 

Centering polynomials or interaction effects will shift the intercept value and change the coefficient values.
In your example, without centering, your intercept is b0.
If you have centered X1 and X2 in the interaction term, then the "new" intercept corresponds to b0 plus the contribution of the interaction term evaluated at the mean values of X1 and X2, with coefficient b'3.

 

See Why does JMP® center polynomials in models by default? for the reasons behind the centering in JMP: centering factors helps reduce multicollinearity in the presence of interaction or polynomial terms in the model, which could otherwise make the term coefficients more complex and less precise to estimate (and could lead to differences in statistical significance evaluation). See Stepwise model question for a practical example.

And see the previous discussions Centering polynomials calculation and Intercept of a parabola for more info.

 

Hope this will clarify the situation,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)
Jimvano7
Level III

Re: Centering IVs in regression

Hi @Victor_G ,

 

Thank you for responding, and I apologize for not being clear. I am not asking why one would mean-center variables; I do this routinely.

 

My two questions are:

1) Does JMP mean-center IVs that are involved in interactions and *NOT* mean-center the simple-effect versions of those IVs? So, using my example, X1 as the simple effect and X1' in the interaction.
2) And if it does, as I read it does, what are the meanings of b1 and b2, since they are no longer the effect of x1 on Y and the effect of x2 on Y when the other variable is 0, respectively?

 

Thanks,

Jim

Victor_G
Super User

Re: Centering IVs in regression

Hi @Jimvano7,

 

Nothing is better than a practical example to see how JMP works.
I have prepared a dataset with two factors X1 and X2, a response Y generated from a predetermined response-surface equation, and two calculated columns for centered X1 and centered X2.

 

When launching the model fitting with the original variables and a response surface model, you can see that JMP does not center the original variables, but it does center the variables involved in interaction or polynomial terms:

[Image: parameter estimates for the response surface model fit with the original variables]

 

When launching the model fitting with the centered variables and a response surface model, you can see that the parameter estimates are the same as before for Xi and centered Xi; only the intercept is different:

[Image: parameter estimates for the response surface model fit with the manually centered variables]

 

Centering variables doesn't change the coefficients of the corresponding main-effect estimates (the "slope" is the same whether you're centering the variable or not), so the interpretation stays the same.

So I think this use case and demonstration answer your 2 questions (I'm not sure I understand the problem with your 2nd question)?

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)
Jimvano7
Level III

Re: Centering IVs in regression

Hi @Victor_G,

 

Thank you for answering my first question and for creating the example. The problem with your example is that you had the JMP mean-centering option turned on when you included X1*X2, so JMP mean-centered in both fits. Turn it off and you get a very different result. Here are data from your dataset. X1, X2, and Y are all continuous variables. X1 has a mean of 9.886 and X2 has a mean of -1.303.

 

Version 1. With the JMP mean centering option turned off and raw IVs you get:

Term        Estimate     Std Error   t Ratio   Prob>|t|
Intercept    164.198     36.29445      4.52    0.0001*
X1          -50.64385    3.103592    -16.32    <.0001*
X2          -14.74154    8.796501     -1.68    0.1058
X1*X2       -6.260865    0.726168     -8.62    <.0001*

 

Version 2. With the JMP mean-centering option turned on, raw simple-effect terms, and a mean-centered interaction term, we get:

Term                        Estimate     Std Error   t Ratio   Prob>|t|
Intercept                   83.558404    35.53908      2.35    0.0266*
X1                         -42.48702     2.977176    -14.27    <.0001*
X2                         -76.63734     4.675518    -16.39    <.0001*
(X1-9.88614)*(X2+1.30283)  -6.260865     0.726168     -8.62    <.0001*

 

Version 3. With all variables manually mean-centered and the JMP mean-centering option turned off:

Term                      Estimate     Std Error   t Ratio   Prob>|t|
Intercept                -236.6291     19.63316    -12.05    <.0001*
Centered X1              -42.48702     2.977176    -14.27    <.0001*
Centered X2              -76.63734     4.675518    -16.39    <.0001*
Centered X1*Centered X2  -6.260865     0.726168     -8.62    <.0001*

 

As you can see, b0, b1, and b2 all take on different values between Version 1 and Version 2. 

 

Version 1: Because this equation has an interaction term, the meaning of b1 (-50.644) and the statistical test performed to determine the significance of b1 (H0: b1 = 0) concern the influence of X1 on Y, **when X2 = 0**. X1 is not a main effect; it is a simple effect, conditional on X2 being 0 (Jaccard & Turrisi, 2003). Similarly, b2 is the influence of X2 on Y when X1 = 0.

 

Now, when X2 is raw in both the simple-effect term and the interaction term, the math works perfectly. b1 is the effect of X1 on Y when X2 is 0: then b2X2 = 0 and the interaction = 0, and we are left with Y = 164.198 - 50.644X1 when X2 = 0.

 

Version 3: If we mean-center X1 as X1' and X2 as X2' for both the simple-effect terms and the interaction term, the math again works perfectly. b1 is the effect of X1' on Y when X2' is 0 (when X2 is -1.303). Then b2X2' = 0 and the interaction = 0, and we are left with Y = -236.629 + (-42.487)X1' when X2' = 0, which is also to say when X2 = -1.303.

 

Version 2: If we keep X1 and X2 raw for the simple-effect terms and mean-center the interaction term using X1' and X2', then b1 is no longer the effect of X1 on Y when X2 is 0, because in this case we are left with Y = b0 + b1X1 + b3X1'X2', or, with X2 = 0 (so X2' = 1.303), Y = 83.558 + (-42.487)X1 + (-6.261)(1.303)X1'. In this case, the slope changes from (-50.644)X1 to (-42.487)X1 + (-8.158)X1'. My issue is: what are the interpretations of b0, b1, and b2 in this version?

 

And, since X1 perfectly predicts X1', we have perfect multicollinearity, so why does JMP resolve the model in Version 2 without an error?

 

Victor_G
Super User

Re: Centering IVs in regression

Hi @Jimvano7,

 

Sorry, I indeed forgot to turn off the mean-centering option in my model tests.
I'm still confused by some of your remarks, such as:

And, since X1 perfectly predicts X1', we have perfect multicollinearity, so why does JMP resolve the model in Version 2 without an error?

X1 and X1' are variables in the model, and only a slight transformation takes you from one to the other. You would have the same situation with X1 and X1, or X1' and X1', so I don't understand the point?

It's more a question of correlation/collinearity between the parameter estimates (b0, b1, b2, ... and b'0, b'1, b'2, ...) that could be a problem.

 

Relaunching the tests, here are some results for the correlations of the estimates:

[Image: VIFs and correlations of estimates for the model without auto-centering (left) and with auto-centering (right)]

On the left is the model version without "auto-centering of polynomials". You can see high VIFs for the X1/X1² and X2 effects. Looking at the correlation of estimates, you can see strong correlations between the X1 and X1² parameter estimates and between the X2 and X1*X2 parameter estimates, as well as inflated standard errors for these main-effect estimates.

On the right is the model version with auto-centering of polynomials. You can see low/acceptable VIFs for all effects (< 2), and even though some effect estimates are still correlated, you avoid the trivial correlation between parameter estimates that share the same original variable (for example, between the X1 and X1² estimates). The auto-centering avoids inflating the standard errors of estimates that share the same original variable (for example, the main effect and the polynomial effect of X1).
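
If you want to reproduce this kind of VIF comparison outside JMP, here is a minimal sketch (synthetic data, plain numpy; VIF_j = 1/(1 - R_j²), where R_j² comes from regressing column j on the remaining columns):

```python
import numpy as np

def vifs(X):
    # VIF of each column of X (X holds the predictors, no intercept column):
    # regress column j on the other columns plus an intercept, take 1 / (1 - R^2).
    n, k = X.shape
    out = []
    for j in range(k):
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        beta, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
        resid = X[:, j] - others @ beta
        r2 = 1 - (resid ** 2).sum() / ((X[:, j] - X[:, j].mean()) ** 2).sum()
        out.append(1 / (1 - r2))
    return np.array(out)

rng = np.random.default_rng(2)
x1 = rng.uniform(5, 15, 40)    # a synthetic, strictly positive factor
xc = x1 - x1.mean()

print(vifs(np.column_stack([x1, x1 ** 2])))   # raw X1 and X1^2: very large VIFs
print(vifs(np.column_stack([xc, xc ** 2])))   # centered: VIFs near 1
```

Centering re-expresses the same pair of columns so that they are nearly orthogonal, which is what brings the VIFs down without changing the fitted model.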

 

I agree that the interpretation may differ in these situations because of the translation of the transformed variables: you're comparing the response value based on deviations from the means of the variables, not deviations from 0.
You may find some good explanations of why it may not be recommended to mean-center the IVs of a model: https://stats.stackexchange.com/questions/65898/why-could-centering-independent-variables-change-the...

So to conclude: JMP does not mean-center IVs, in order to stay in the original scale and interpretation of the effects, but it does center polynomials and interactions to avoid collinearity and to improve the precision of their parameter estimates. The Prediction Profiler automatically uses the original variables to better show variable effects (and stays with the same variable "coordinates" no matter the effect).

 

Hope this will help you,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)
Jimvano7
Level III

Re: Centering IVs in regression

Thanks @Victor_G 

 

I know how the correlation among variables changes with mean centering, and I know when it is appropriate to center or not. What I don't know is the interpretation of b0, b1, and b2 in this JMP-derived model, Version 2 above, where the model intermixes raw and mean-centered variables. Your response does not appear to answer my question.

 

In my discussion about the Version 2 model, I asked about the meaning of b1. I showed that when attempting to interpret b1, we set X2 = 0 and are left with Y = 83.558 + (-42.487)X1 + (-8.158)X1'. In this case, b1 (-42.487) is NOT the slope of the influence of X1 on Y when X2 = 0. Instead, b1 does not appear to have an interpretable meaning. The same logic applies to X2. In addition, there is never a place (in your dataset) where X1, X2, and X1'X2' are all equal to zero, so b0 is not the mean of Y when the two IVs and the product of the two IVs are equal to 0. So, what do these three coefficients mean?

MRB3855
Super User

Re: Centering IVs in regression

Hi @Jimvano7: FWIW, I've been following this thread with some interest, and here's my two cents:

 

As you can see, interpretation of the parameter estimates in the mean-centered model is difficult. The "intercept" is really just a constant that ensures a least-squares fit, and the coefficients are not easily interpreted either. And many of the corresponding p-values in the tables (mean-centered vs. not mean-centered) aren't even testing the same hypothesis.

 

That said, all predictions are the same, and if you are careful you can show that all inferences are the same (when testing the same hypotheses, the p-values, etc. are identical).
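
For instance, matching the Version 2 output against Version 1 term by term (a quick check, using the means m1 = 9.88614 and m2 = -1.30283 shown in the tables above):

```latex
c_1 = b_1 + b_3 m_2 = -50.64385 + (-6.260865)(-1.30283) \approx -42.48701
c_2 = b_2 + b_3 m_1 = -14.74154 + (-6.260865)(9.88614)  \approx -76.63733
c_0 = b_0 - b_3 m_1 m_2 = 164.198 - (-6.260865)(9.88614)(-1.30283) \approx 83.56
```

These reproduce the Version 2 estimates up to rounding of the displayed values: the Version 2 "main effect" coefficients turn out to be simple slopes evaluated at the mean of the other variable, and all three versions describe one and the same fitted plane.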

 

But, if it is the displayed output you are interested in, and an easy interpretation of that output is desired, then don't mean-center.

Jimvano7
Level III

Re: Centering IVs in regression

@MRB3855 

The different versions of the model (Versions 1-3) do not test the same hypotheses, and they make different predictions, as I showed using the JMP output in an earlier post. For Versions 1 and 3, the hypotheses are clear to me, and the interpretations of the coefficients (parameter estimates) all have meaning. And the predictions are different because the questions being answered are different between V1 and V3. But I have no idea what the hypothesis is for Version 2 (the JMP version with intermixed raw and mean-centered variables for X1 and X2), because the coefficients have no meaning as far as I can tell.

In Version 3, with all variables being mean-centered, all coefficients, including the intercept, have meaning.  b0 is not "just a constant to ensure a least squares fit."  It is the mean of Y when X1' is 0 (at the mean of X1) and X2' is 0 (at the mean of X2).

 

The only version that makes no sense to me is the JMP-derived intermixed model.

MRB3855
Super User

Re: Centering IVs in rrgression

Hi @Jimvano7: So, if you completely expand the equations based on the output from Versions 2 and 3, respectively, then gather like terms and simplify, you don't get the same equation as Version 1?

 

And it's easy to check without doing it manually: just save the predicted Y for each version as three new columns in your data table (via Save Columns in the red triangle menu of the output). If you do that, are the predicted Y's different?
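
If you want to run the same check outside JMP, here is a minimal sketch with synthetic data and plain numpy least squares (the data are made up for illustration, not the dataset from this thread):

```python
import numpy as np

# Synthetic stand-ins for X1, X2, and Y
rng = np.random.default_rng(1)
n = 30
x1 = rng.uniform(5, 15, n)
x2 = rng.uniform(-4, 2, n)
y = 160 - 50 * x1 - 15 * x2 - 6 * x1 * x2 + rng.normal(0, 10, n)

m1, m2 = x1.mean(), x2.mean()
ones = np.ones(n)

designs = {
    "Version 1 (raw)":              np.column_stack([ones, x1, x2, x1 * x2]),
    "Version 2 (centered product)": np.column_stack([ones, x1, x2,
                                                     (x1 - m1) * (x2 - m2)]),
    "Version 3 (all centered)":     np.column_stack([ones, x1 - m1, x2 - m2,
                                                     (x1 - m1) * (x2 - m2)]),
}

coefs = {name: np.linalg.lstsq(X, y, rcond=None)[0] for name, X in designs.items()}
preds = {name: designs[name] @ b for name, b in coefs.items()}

for name, b in coefs.items():
    print(name, np.round(b, 3))     # intercept and main-effect estimates differ

ref = preds["Version 1 (raw)"]
print(all(np.allclose(ref, p) for p in preds.values()))   # True: same predictions
```

All three design matrices span the same column space, so the predicted Y's come out identical even though the printed coefficients do not.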
