Fitting a Linear Model in Generalized Regression


Learn more in our free online course:
Statistical Thinking for Industrial Problem Solving

In this video, we show how to fit linear models using generalized regression. We use the Chemical Manufacturing data and fit a least squares model for the continuous response, Yield. Then we fit a logistic regression model for the categorical response, Performance.

 

First, we fit a linear regression model for Yield.

 

To do this, we select Fit Model from the Analyze menu. We select Yield for the Y role, and select both sets of predictors as model effects.

 

Then we select Generalized Regression from the menu for Personality.

 

Because we have a continuous response, the default response distribution is Normal. However, the menu shows that several other response distributions are available.

 

We’ll run this model with the normal distribution.
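
As a side note, the same kind of fit can be sketched outside JMP with Python's statsmodels, which also treats a normal response with its default identity link as ordinary least squares. The file and column names below are hypothetical stand-ins for the Chemical Manufacturing data, not the actual variables used in the video.

    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    # Hypothetical file and column names standing in for the Chemical Manufacturing data
    df = pd.read_csv("chemical_manufacturing.csv")

    # Gaussian family with its default identity link: equivalent to least squares
    fit = smf.glm("Yield ~ Temperature + Pressure + C(Supplier)",
                  data=df,
                  family=sm.families.Gaussian()).fit()
    print(fit.summary())   # coefficients, standard errors, and log-likelihood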

 

You can see that we have fit a standard least squares regression model. The Model Summary table provides some familiar statistics, such as RSquare and RMSE. It also reports more advanced statistics used in statistical modeling, such as BIC (the Bayesian Information Criterion), AICc (the corrected Akaike Information Criterion), and the negative LogLikelihood.
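
If you are curious how these criteria are computed from the log-likelihood, here is a small sketch. Here k is the number of estimated parameters (including the error variance for a least squares fit), n is the number of observations, and the example numbers are made up.

    import math

    def aic_bic(loglik, k, n):
        aic  = -2 * loglik + 2 * k
        aicc = aic + (2 * k * (k + 1)) / (n - k - 1)   # small-sample correction
        bic  = -2 * loglik + k * math.log(n)
        return aicc, bic

    # Made-up numbers: log-likelihood of -150 with 12 parameters and 90 observations
    print(aic_bic(-150.0, 12, 90))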

 

In the top section, you see terms such as “Link” and “Identity.”

 

We have built what is known as a generalized linear model, and these terms describe how it was specified: the link function relates the mean of the response to the linear combination of predictors.
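
In textbook notation (a general sketch, not JMP output), a generalized linear model ties the mean of the response to the predictors through a link function g:

    g(E[Y]) = β0 + β1·x1 + … + βp·xp

For a normal response the identity link g(μ) = μ is used, so the mean is modeled directly. The binomial model fit later in this video uses the logit link, g(p) = log(p / (1 − p)).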

 

In the Effects Test table at the bottom, you see the familiar significance tests and p-values for the individual terms in the model. Several of the terms are significant.

 

You also see a Parameter Estimates table. This reports the coefficients in the linear model.

 

In generalized regression, 0/1 indicator coding is used for categorical predictors.
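
To illustrate what 0/1 indicator coding looks like, here is a tiny sketch using a hypothetical Supplier column; each level after the baseline gets its own 0/1 column.

    import pandas as pd

    df = pd.DataFrame({"Supplier": ["A", "B", "C", "A", "B"]})
    # Indicator (dummy) coding: the dropped level "A" becomes the baseline
    print(pd.get_dummies(df["Supplier"], prefix="Supplier", drop_first=True, dtype=int))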

 

There are several red triangle options available for the model. Because we have a continuous response, you can use this menu to access profilers, residual plots, and options to save columns to the data table.
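
As an aside, a basic residuals-versus-predicted plot of the kind offered here can be sketched in Python as well; the arrays below are made-up values, not the Chemical Manufacturing data.

    import numpy as np
    import matplotlib.pyplot as plt

    predicted = np.array([82.1, 85.4, 79.8, 88.0, 84.3])   # made-up predicted Yield values
    observed  = np.array([83.0, 84.9, 78.5, 89.2, 84.0])   # made-up observed Yield values
    residuals = observed - predicted

    plt.scatter(predicted, residuals)
    plt.axhline(0, linestyle="--")
    plt.xlabel("Predicted Yield")
    plt.ylabel("Residual")
    plt.show()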

 

What if we want to model the categorical response, Performance, instead of the continuous response?

 

We’ll select Model Dialog from the top red triangle and change the response from Yield to Performance.

 

You can see that the response distribution is now Binomial. We'll change the target level to Reject and click Run.
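
The corresponding logistic fit can also be sketched in Python with statsmodels. Again the file and column names are hypothetical, and Reject is coded as 1 so the model predicts the probability of a reject, mirroring the target level.

    import pandas as pd
    import statsmodels.api as sm
    import statsmodels.formula.api as smf

    df = pd.read_csv("chemical_manufacturing.csv")             # hypothetical file name
    df["reject"] = (df["Performance"] == "Reject").astype(int)

    # Binomial family with its default logit link: logistic regression
    fit = smf.glm("reject ~ Temperature + Pressure + C(Supplier)",
                  data=df,
                  family=sm.families.Binomial()).fit()
    print(fit.summary())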

 

Now you can see that JMP has fit a logistic regression model.

 

In the Model Summary table, you can see the estimation method, the response distribution, and other information. 

 

But all of the other output looks the same.

 

That is, JMP reports the same measures of fit in the Model Summary table, and it reports effects tests and parameter estimates.

 

You can see that none of the terms are significant predictors of Performance. With only 90 observations, we don’t have a lot of data to fit this model.

 

Let’s look at the red triangle options for the model. Because we fit a logistic model, you can turn on options such as the confusion matrix, ROC curves, and lift curves.
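
For reference, the same kinds of classification diagnostics can be sketched from a vector of predicted probabilities; the arrays below are made-up examples, with 1 standing for Reject.

    import numpy as np
    from sklearn.metrics import confusion_matrix, roc_auc_score, roc_curve

    y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])                  # actual class (1 = Reject)
    y_prob = np.array([0.2, 0.4, 0.7, 0.3, 0.1, 0.8, 0.5, 0.6])  # predicted reject probability

    y_pred = (y_prob >= 0.5).astype(int)           # classify with a 0.5 cutoff
    print(confusion_matrix(y_true, y_pred))        # rows = actual, columns = predicted
    print(roc_auc_score(y_true, y_prob))           # area under the ROC curve
    fpr, tpr, cutoffs = roc_curve(y_true, y_prob)  # points that trace the ROC curve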

 

You can see that generalized regression provides one framework to fit and explore a variety of different response distributions.