Developer Tutorial: Selecting the Appropriate JMP Pro Generalized Regression Distribution for Your Response
Background:
- Simple linear regression assumes the errors (and the response) are normal. When normality doesn't hold, predictions may fall outside the meaningful range (maybe not a big deal) and inference is not reliable (probably a bigger deal).
- Generalized Linear Models (GLM) allow a response distribution other than the normal, for example for:
- Count data (e.g., number of defects on a product)
- Skewed data (e.g., salaries)
- Proportions
- Labels (e.g., good/neutral/bad or yellow/blue/green)
- Generalized Linear Models (GLM) have three ingredients
- A distribution for the response given the predictors (the random piece)
- A linear predictor (the systematic piece)
- A link function (the piece that connects the random and systematic pieces)
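The three ingredients can be made concrete with a small sketch. It assumes a Poisson response with a log link (one common pairing for count data, not something this summary prescribes), and the coefficient and predictor values are hypothetical, chosen only for illustration.

```python
import math

def linear_predictor(x, beta):
    """Systematic piece: eta = beta[0] + beta[1]*x[0] + beta[2]*x[1] + ..."""
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))

def inverse_log_link(eta):
    """Link piece: the log link is g(mu) = log(mu), so mu = exp(eta)."""
    return math.exp(eta)

def poisson_log_pmf(y, mu):
    """Random piece: log P(Y = y) when Y ~ Poisson(mu)."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)

# Hypothetical coefficients and a single predictor value (illustration only).
beta = [0.5, 0.2]
x = [3.0]

eta = linear_predictor(x, beta)       # linear scale, can be any real number
mu = inverse_log_link(eta)            # mean scale, always positive
log_prob_of_2 = poisson_log_pmf(2, mu)  # log-probability of observing a count of 2
```

The link function is what keeps the predicted mean positive here: the linear predictor may take any real value, while exp(eta) cannot go below zero.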
- Use R-square to compare models within a distribution.
- Use AICc and BIC (information criteria) to compare between distributions.
- AICc and BIC approximate the Kullback-Leibler divergence, a measure of the distance from the fitted model to the truth, so smaller values are better.
- Use them to compare models within the same distribution and across different distributions.
- Rule of thumb: AIC tends to overfit and BIC tends to underfit.
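As a rough illustration of how the information criteria are computed, the sketch below uses the standard textbook formulas for AICc and BIC. The log-likelihoods, parameter counts, and sample size are hypothetical numbers, not output from JMP Pro, and the comparison assumes every log-likelihood was computed on the same response data and scale.

```python
import math

def aicc(log_lik, k, n):
    """Corrected AIC: -2*logL + 2k + 2k(k+1)/(n-k-1), for k parameters and n rows."""
    if n - k - 1 <= 0:
        raise ValueError("AICc needs n > k + 1")
    return -2.0 * log_lik + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)

def bic(log_lik, k, n):
    """BIC: -2*logL + k*ln(n)."""
    return -2.0 * log_lik + k * math.log(n)

# Hypothetical fits: distribution name -> (maximized log-likelihood, parameter count).
n_rows = 50
fits = {"normal": (-120.0, 3), "gamma": (-112.5, 3)}

scores = {name: (aicc(ll, k, n_rows), bic(ll, k, n_rows)) for name, (ll, k) in fits.items()}

# Smaller is better; here we pick the distribution with the lowest AICc.
best_by_aicc = min(scores, key=lambda name: scores[name][0])
```

Because both hypothetical fits use the same number of parameters, the ranking here is driven entirely by the log-likelihood; the penalty terms matter when candidate models differ in size.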
- General guidelines for choosing a distribution for a continuous response:
- Do we have negative values? Use normal.
- Is it bound to (0,1)? Use beta.
- Does variance increase with the mean? Use gamma, Weibull, or lognormal.
- Is it time-to-event/censored? Probably use Weibull or lognormal.
- A pretty good catch-all? Use normal.
- Do we suspect that we have outliers? Use Cauchy or t(5).
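The guidelines above can be written down as a small helper. This is only a sketch: the function name, its arguments, and the order in which the questions are asked are assumptions made for illustration, and the result is a list of candidates to compare, not a final choice.

```python
def suggest_continuous_distributions(values, time_to_event=False,
                                     suspect_outliers=False,
                                     variance_grows_with_mean=False):
    """Return candidate distributions for a continuous response.

    Hypothetical helper: the flags encode judgment calls the analyst makes,
    and the check order is an assumption, not part of the guidelines.
    """
    if not values:
        raise ValueError("need at least one response value")
    if suspect_outliers:
        return ["Cauchy", "t(5)"]
    if time_to_event:
        return ["Weibull", "lognormal"]
    if min(values) < 0:
        return ["normal"]
    if all(0 < v < 1 for v in values):
        return ["beta"]
    if variance_grows_with_mean:
        return ["gamma", "Weibull", "lognormal"]
    return ["normal"]  # the catch-all

# Example with made-up response values strictly between 0 and 1.
candidates = suggest_continuous_distributions([0.12, 0.55, 0.91])
```

When the helper returns several candidates, fitting each one and comparing AICc or BIC is the natural next step.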
- Choosing a distribution when the response isn't numeric:
- Is it two-level? Use the binomial (e.g., Yes/No or A/B).
- Is it 3+ levels and order matters? Use ordinal logistic (e.g., Low/Medium/High or Small/Medium/Large).
- Is it 3+ levels and order doesn't matter? Use the multinomial (e.g., Pizza/Hamburger/Burrito or Red/Blue/Green/Orange).
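The same rules for a non-numeric response fit in a few lines. The function name and the `ordered` flag are hypothetical; whether the order of the levels matters is something the analyst decides, not something the code can infer.

```python
def suggest_categorical_distribution(labels, ordered=False):
    """Map the number of distinct levels (and whether order matters) to a distribution."""
    n_levels = len(set(labels))
    if n_levels < 2:
        raise ValueError("a response needs at least two distinct levels")
    if n_levels == 2:
        return "binomial"
    return "ordinal logistic" if ordered else "multinomial"

# Example with made-up labels where order matters.
choice = suggest_categorical_distribution(["Low", "High", "Medium", "Low"], ordered=True)
```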
See how to choose, specify and build, compare, and evaluate models using Generalized Regression in JMP Pro. Q&A is included throughout the presentation.
Resources:
- Developer Tutorial: Using JMP Pro Generalized Regression to Better Understand Observational Data
- Developer Tutorial: Using JMP Pro Generalized Regression to Analyze Designed Experiments
- Generalized Regression documentation
