Hierarchical regression

emeritus · Jun 8, 2023 5:47 PM

How do I specify blocks of variables in a hierarchical regression?

Thanks

Georg · Apr 10, 2022 06:19 AM

The question is how to grab many columns?

Maybe you want to use the Column Filter menu:

Launch Windows in Platforms (jmp.com)

Georg

ih · Apr 11, 2022 05:54 PM

If you are referring to hierarchical linear regression, then I don't think this is a separate platform in JMP. That could be because there are fairly simple alternatives (at least based on my understanding of hierarchical linear regression):

adding/removing terms in the standard least squares report,
stepwise regression, and
ensemble modeling.

Note that ensemble modeling extends beyond linear regression: for example, you could decide whether to include a partition model after a linear regression.

Add/Remove Terms

The Effect Summary section of the least squares report has options to add and remove terms. You can compare summary statistics between models just as described in the Stepwise section.

Stepwise

Instead of adding variables in blocks, you can add them individually in any order and just stop after adding each block. Using Fit Model, add all of the possible terms in every group as model effects. For example, if you want to predict age in the Big Class data set from height and weight and then decide whether to add the height*weight term, you would add all three terms in the model launch dialog box. Then enter the terms in your first block, record the model fit statistics, add your second block, and compare. You may also be able to use the settings in the Stepwise Regression Control section to arrive automatically at the model you would have chosen yourself.

In the Fit Model launch, add all model effects and select Stepwise:

Add the first block's variables (I am ignoring the P value for weight):

Now add the second block and notice how the RSquare went up slightly, but so did the AIC and BIC, likely indicating that you would not want to include the second block.
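As a rough illustration of that comparison, the sketch below computes AIC, AICc and BIC for two least-squares fits from their residual sums of squares, using the standard Gaussian likelihood formulas. The function name `gaussian_ic`, the RSS values, the sample size and the coefficient counts are all made up for illustration, and JMP's report may follow slightly different conventions, so treat the numbers as directional only.

```python
import math

def gaussian_ic(rss, n, n_coef):
    """AIC, AICc and BIC for a least-squares fit.

    n_coef counts the regression coefficients, intercept included.
    """
    k = n_coef + 1  # the error variance is estimated as well
    # -2 * maximized Gaussian log-likelihood for a linear model
    neg2_loglik = n * (math.log(2 * math.pi) + math.log(rss / n) + 1)
    aic = neg2_loglik + 2 * k
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)  # small-sample correction
    bic = neg2_loglik + k * math.log(n)
    return aic, aicc, bic

# Hypothetical values, not taken from the report in this thread:
# block 2 adds one term and lowers the RSS only slightly.
block1 = gaussian_ic(rss=50.0, n=40, n_coef=3)
block2 = gaussian_ic(rss=49.5, n=40, n_coef=4)

for name, b1, b2 in zip(("AIC", "AICc", "BIC"), block1, block2):
    print(f"{name}: block 1 = {b1:.2f}, block 1 + 2 = {b2:.2f}")
```

Lower values are better for all three criteria. With these made-up inputs the RSS drops a little (so RSquare rises), yet every criterion increases, which is the pattern described above.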

Ensemble Modeling

In fit model, build a model using the first block and save the residuals. Then try to predict the residuals using subsequent models.

Build a model using the first block (here I converted age to continuous to make the example simpler):

Save the predictions and the residuals:

Now launch Fit Model again and predict the residual from the first model using the second block. Remember that neither of these needs to be a linear model. You might get a message that you are missing an effect because JMP (correctly) doesn't want to build a model from the cross term without each individual term. In this example they were already included in the previous step, but you could add them again.

Now you can use the results in this model to decide whether to include the second block in your final model. In this case the p-values for both terms are very high, indicating that you should not include block 2.

To get the prediction using both models, add the predictions from each.
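Outside JMP, the same residual-fitting idea can be sketched in a few lines of Python. This is a simplified illustration, not the JMP workflow itself: each stage uses a single predictor fitted by ordinary least squares, the helper `fit_line` is hypothetical, and the height, weight and age values are made up rather than taken from Big Class.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a single predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def sse(actual, predicted):
    """Sum of squared errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))

# Made-up illustration data (age treated as continuous).
height = [59, 61, 55, 66, 52, 60, 61, 51, 60, 65]
weight = [95, 123, 74, 145, 64, 84, 128, 79, 112, 107]
age = [12, 12, 12, 13, 12, 12, 13, 12, 13, 14]

# Stage 1: fit the response from the first block, keep predictions and residuals.
a0, a1 = fit_line(height, age)
pred1 = [a0 + a1 * h for h in height]
resid1 = [y - p for y, p in zip(age, pred1)]

# Stage 2: fit the stage-1 residuals from the second block.
c0, c1 = fit_line(weight, resid1)
pred2 = [c0 + c1 * w for w in weight]

# Combined prediction: the sum of the two stages' predictions.
combined = [p1 + p2 for p1, p2 in zip(pred1, pred2)]

print("SSE, block 1 only:", sse(age, pred1))
print("SSE, both stages: ", sse(age, combined))
```

If the second stage barely reduces the error, that points the same way as high p-values in the second model: the second block adds little.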

emeritus · Apr 13, 2022 02:32 PM

The idea of hierarchical regression seems to have different meaning to different folks. Here is what I am looking for:

Hierarchical regression is a way to show if variables of your interest explain a statistically significant amount of variance in your Dependent Variable (DV) after accounting for all other variables. This is a framework for model comparison rather than a statistical method. In this framework, you build several regression models by adding variables to a previous model at each step; later models always include smaller models in previous steps. In many cases, our interest is to determine whether newly added variables show a significant improvement in R2 (the proportion of explained variance in DV by the model).

The idea is to account for variance from sources that might confound or conceal the effects of the variable of interest. Take TV effects, for instance. A researcher might want to know if aggressive activity is "caused" by watching cartoon violence, but there are other factors, so a hierarchical model might include blocks of variables entered one by one. With all spurious variance accounted for by the earlier blocks' R sq, the last term is the variable of interest. The model might look like this:

Aggressive behavior measure = (age, gender) + (education, income) + (TV violence exposure)

You look at the R sq for each block. (You disregard the activity within each block.) If there is still an interesting Rsq for TV violence, you make the claim that it is significant even after the other variables are taken into account. This way you can make a pretty strong claim.
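To make the block-by-block bookkeeping concrete, here is a minimal sketch in plain Python. It fits each cumulative model by ordinary least squares and reports the R sq and the change in R sq as each block is entered. Everything in it is an assumption for illustration: the data are invented, and `ols_r2` and `hierarchical_r2` are hypothetical helper names, not anything from SPSS or JMP.

```python
def ols_r2(columns, y):
    """R^2 from regressing y on an intercept plus the given predictor columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Augmented normal equations: (X'X | X'y).
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(p)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for k in range(p):
        pivot = max(range(k, p), key=lambda r: abs(A[r][k]))
        A[k], A[pivot] = A[pivot], A[k]
        for r in range(p):
            if r != k:
                f = A[r][k] / A[k][k]
                A[r] = [a - f * b for a, b in zip(A[r], A[k])]
    beta = [A[k][p] / A[k][k] for k in range(p)]
    fitted = [sum(b * x for b, x in zip(beta, row)) for row in X]
    mean_y = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def hierarchical_r2(blocks, y):
    """(cumulative R^2, change in R^2) for each block, entered in order."""
    results, entered, previous = [], [], 0.0
    for block in blocks:
        entered = entered + block
        r2 = ols_r2(entered, y)
        results.append((r2, r2 - previous))
        previous = r2
    return results

# Invented data for ten respondents.
age = [8, 9, 10, 11, 12, 8, 9, 10, 11, 12]
gender = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]
education = [10, 12, 16, 12, 14, 16, 10, 14, 12, 18]
income = [30, 45, 60, 38, 52, 75, 28, 50, 41, 80]
tv_violence = [2, 5, 1, 4, 3, 6, 2, 7, 1, 5]
aggression = [3, 6, 2, 4, 4, 7, 2, 8, 3, 6]

blocks = [[age, gender], [education, income], [tv_violence]]
results = hierarchical_r2(blocks, aggression)
for step, (r2, delta) in enumerate(results, start=1):
    print(f"Block {step}: R sq = {r2:.3f}, change in R sq = {delta:.3f}")
```

The change reported for the last block is the share of variance attributed to TV violence after the earlier blocks are already in the model.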

It's easy to do in SPSS, and I am a little disappointed that JMP doesn't offer a similar way to sequentially create blocks of variables.

Unless there is something JMP figured out with Stepwise, UGH. Back in grad school we were taught that stepwise is "know nothing" regression because the order in which variables are entered has to do with the program's algorithm and not theory. This is easy to show by dropping variables in and out of a stepwise model and watching the coefficients go nuts. The JMP document on stepwise suggests running a complex model and then dropping variables out based on their "significance" NOT

I will look at the example you gave me at the end of your post, but I am not sure that is what I am looking for.

Thanks for the help

statman · Apr 13, 2022 03:18 PM

Your explanation is much better than your original post, thanks. Hierarchy (or nesting) is a specific term used to designate a specific relationship between components of variation or factors. Hierarchy implies there is a rational order that must be accounted for. For example, batch-to-batch and within-batch variation. Within-batch must be nested within batch...you can't get multiple batches from a within-batch sample.

Here are my thoughts:

Regarding "Hierarchical regression is a way to show if variables of your interest explain a statistically significant amount of variance in your Dependent Variable (DV) after accounting for all other variables."

1. Statistical significance is a conditional statement. It is dependent on inference space, and what components are being compared. It is completely dependent on how the data was collected. I'm not exactly sure what you mean by "after accounting for all other variables"? Analysis using type III SS will evaluate each term in your model after all of the other terms in your model are accounted for, but I don't think that is what you mean...

2. Context is always necessary to interpret the analysis done by the software. Context must be provided by the topic "experts".

3. Designing a data collection plan that is driven by hypotheses and subsequently includes the sources of variation representative of those hypotheses is something you do...not the software. How you choose to subset those sources and what comparisons you want to make is your decision. Whether those sources are nested, systematic or crossed is a function of how you collect the data.

4. Rsq's are only one of the metrics used to develop models. I would not restrict the model building to that one value, and I would use Rsq adjusted as the default. Consider the delta between Rsq and Rsq adjusted (for over-specifying the model), RMSE, p-values, residuals and residual plots, etc.

5. "In this framework, you build several regression models by adding variables to a previous model at each step; later models always include smaller models in previous steps." This is essentially stepwise regression. You can choose the components to start with and what components to add or you can let the software evaluate this based on criteria you set.

https://www.jmp.com/support/help/en/16.2/?os=mac&source=application&utm_source=helpmenu&utm_medium=a...

You may also use partitioning platforms to perform similar functions.

https://www.jmp.com/support/help/en/16.2/?os=mac&source=application&utm_source=helpmenu&utm_medium=a...

Another option in JMP is to compare models:

https://www.jmp.com/support/help/en/16.2/?os=mac&source=application&utm_source=helpmenu&utm_medium=a...

6. "It's easy to do in SPSS, and I am a little disappointed that JMP doesn't offer a similar way to sequentially create blocks of variables." I don't use SPSS, but if you could show us what you mean, we might be able to guide you. What do you mean that the software sequentially creates blocks? You enter hypotheses into SPSS and it properly creates blocks for you? I'd like to see that.

7. Aggressive behavior measure = (age, gender) + (education, income) + (TV violence exposure). There are many more variables that you have not included or that are confounded with your model: demographics, family unit, personal trauma, upbringing, exposure to real violence. And how do you measure aggressive behavior? Is your measurement system consistent? Does your measurement system bias the results?

"All models are wrong, some are useful" G.E.P. Box

Mark_Bailey · Apr 14, 2022 09:39 AM

This procedure is what he is talking about.

emeritus · Apr 18, 2022 08:26 PM

That is exactly what I used to do in SPSS. I would like to do that in JMP as well

Mark_Bailey · Apr 14, 2022 09:49 AM

You do not have to click Go or Step in the Stepwise platform and use the automated model selection approach that you wish to avoid. You can specify all the terms in the full model using the Analyze > Fit Model launch dialog and then add them in Stepwise in groups as you deem suitable from a priori knowledge. That is to say, you can identify blocks by the terms that you enter together. The history section allows you to select any previous model using a radio button and it returns to that model. Some information that you want, e.g., change in R square, is not directly available.
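Since the change in R square is not reported directly, one workaround is to note the RSquare at each step and do the arithmetic by hand. The sketch below applies the standard F statistic for comparing nested least-squares models; the function name `r2_change_f` and the example numbers are hypothetical, not values from a JMP report.

```python
def r2_change_f(r2_reduced, r2_full, n, p_full, q):
    """Change in R^2 and its F statistic when q terms are added.

    n      : number of observations
    p_full : number of predictors in the larger model (intercept not counted)
    q      : number of terms added in this block
    """
    df_num = q
    df_den = n - p_full - 1
    delta = r2_full - r2_reduced
    f_stat = (delta / df_num) / ((1.0 - r2_full) / df_den)
    return delta, f_stat, df_num, df_den

# Hypothetical RSquare values recorded before and after entering the last block.
delta, f_stat, df_num, df_den = r2_change_f(
    r2_reduced=0.40, r2_full=0.50, n=100, p_full=5, q=1
)
print(f"Change in R sq = {delta:.3f}, F({df_num}, {df_den}) = {f_stat:.2f}")
```

The resulting F value would then be compared against an F distribution with the stated degrees of freedom to judge whether the block's contribution is significant.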

emeritus · Apr 18, 2022 08:34 PM

This helps.

Too bad about R sq; that is an important diagnostic I use.

The old SPSS routine was really useful for folks doing survey research, especially looking to isolate the effects of media variables.

Folks on this discussion have been really helpful, but I am seeing the JMP implementation I want as a bit of a hack.

If I were advising JMP I would ask that this routine be made a bit more transparent.

Also, for some of us old guys, the idea of stepwise regression puts up a red flag that dates back to the original maroon SPSS manual. Maybe I am the only one left and it's not an issue.

Mark_Bailey · Jun 16, 2022 09:29 AM

I am sorry for taking a long time to get back to this discussion. I recommend that you add this request in the Wish List area. JMP Development uses this resource to obtain ideas from users.
