
Developer Tutorial: Selecting the Appropriate JMP Pro Generalized Regression Distribution for Your Response

Background:

  • Simple linear regression assumes the errors (and therefore the response) are normal. When normality doesn't hold, predictions may fall outside the meaningful range (maybe not a big deal) and inference is not reliable (probably a bigger deal)
  • Generalized Linear Models (GLMs) assume a distribution other than the normal for the response, for example for:
    • Count data (e.g., number of defects on a product)

    • Skewed data (e.g., salaries)

    • Proportions

    • Labels (e.g., good/neutral/bad or yellow/blue/green)

  • Generalized Linear Models (GLM) have three ingredients
    • A distribution for the response given the predictors (the random piece)
    • A linear predictor (the systematic piece)
    • A link function (the piece that connects the random and systematic pieces)
  • Use the R-square to compare models within a distribution. 

  • Use AICc and BIC (information criteria) to compare between distributions

    • AICc and BIC estimate the Kullback-Leibler divergence, which is the distance from the fitted model to the truth

    • Use them to compare models within the same distribution and across different distributions

    • Rule of thumb: AIC tends to overfit and BIC tends to underfit

  • General guidelines for choosing a distribution for a continuous response
    • Do we have negative values? Use normal

    • Is it bound to (0,1)?  Use beta

    • Does variance increase with the mean?  Use gamma, Weibull, lognormal

    • Is it time-to-event/censored? Probably use Weibull or lognormal

    • A pretty good catch-all? Use normal

    • Do we suspect that we have outliers? Use Cauchy or t(5)

  • Choosing a distribution when the response isn’t numeric
    • Is it two-level? Use the binomial (e.g., Yes/No or A/B)
    • Is it 3+ levels and order matters? Use ordinal logistic (e.g., Low/Medium/High or Small/Medium/Large)
    • Is it 3+ levels and order doesn’t matter? Use the multinomial (e.g., Pizza/Hamburger/Burrito or Red/Blue/Green/Orange)
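
The guidelines above can be read as a small decision procedure. Here is a minimal Python sketch of that reading. It is not JMP code: the function name `suggest_distribution`, its flags, and the order in which the questions are checked are all assumptions made for illustration, since the list does not say which question takes priority when several apply.

```python
from typing import List, Optional


def suggest_distribution(
    *,
    numeric: bool = True,
    levels: Optional[int] = None,
    ordered: bool = False,
    has_negatives: bool = False,
    bounded_unit_interval: bool = False,
    time_to_event: bool = False,
    variance_grows_with_mean: bool = False,
    suspect_outliers: bool = False,
) -> List[str]:
    """Return candidate response distributions following the guidelines.

    The priority of the checks is an assumption of this sketch,
    not something stated in the guidelines themselves.
    """
    if not numeric:
        if levels is None or levels < 2:
            raise ValueError("a non-numeric response needs at least two levels")
        if levels == 2:
            return ["binomial"]
        return ["ordinal logistic"] if ordered else ["multinomial"]

    # Continuous response: walk through the questions in the list.
    if suspect_outliers:
        return ["Cauchy", "t(5)"]
    if has_negatives:
        return ["normal"]
    if bounded_unit_interval:
        return ["beta"]
    if time_to_event:
        return ["Weibull", "lognormal"]
    if variance_grows_with_mean:
        return ["gamma", "Weibull", "lognormal"]
    return ["normal"]  # the "pretty good catch-all"


# Example: skewed, strictly positive data such as salaries.
candidates = suggest_distribution(variance_grows_with_mean=True)
```

A helper like this only narrows the candidates. When it returns several, they still need to be compared on the data, for example with AICc or BIC.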

See how to choose, specify and build, compare, and evaluate models using Generalized Regression in JMP Pro. Q&A is included throughout the presentation.
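
To make the "compare" step concrete, here is a rough Python sketch of comparing two response distributions with AICc and BIC. It uses the standard definitions AICc = -2 logL + 2k + 2k(k+1)/(n-k-1) and BIC = -2 logL + k ln(n). The salary values are made up for illustration, the fits are intercept-only maximum likelihood fits, and none of this is meant to reproduce how JMP Pro does the computation.

```python
import math


def aicc(loglik: float, k: int, n: int) -> float:
    """Corrected AIC: -2*logL + 2k + 2k(k+1)/(n-k-1)."""
    if n - k - 1 <= 0:
        raise ValueError("AICc needs n > k + 1")
    return -2.0 * loglik + 2.0 * k + 2.0 * k * (k + 1) / (n - k - 1)


def bic(loglik: float, k: int, n: int) -> float:
    """BIC: -2*logL + k*ln(n)."""
    return -2.0 * loglik + k * math.log(n)


def normal_loglik(y):
    """Maximized log-likelihood of an intercept-only normal model."""
    n = len(y)
    mean = sum(y) / n
    var = sum((v - mean) ** 2 for v in y) / n  # ML estimate of the variance
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)


def lognormal_loglik(y):
    """Maximized log-likelihood of an intercept-only lognormal model."""
    logs = [math.log(v) for v in y]
    # Normal likelihood on log(y) plus the Jacobian term -sum(log y).
    return normal_loglik(logs) - sum(logs)


# Hypothetical right-skewed salaries (in thousands), illustration only.
salaries = [32, 35, 38, 41, 45, 52, 60, 75, 110, 240]
n, k = len(salaries), 2  # both models estimate a location and a scale

fits = {
    "normal": normal_loglik(salaries),
    "lognormal": lognormal_loglik(salaries),
}
scores = {name: (aicc(ll, k, n), bic(ll, k, n)) for name, ll in fits.items()}
best = min(scores, key=lambda name: scores[name][0])  # smallest AICc wins
```

On this made-up sample the lognormal has the smaller AICc and BIC, which fits the guideline that skewed data such as salaries are often better served by a skewed distribution than by the normal.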

Comments

During the demo, these questions were addressed:

 

Can you use zero inflated for an inverse problem, where you have a lot of extreme high measures?

 

I typically look at reaction times, which tend to follow an ex-Gaussian distribution. That distribution is a convolution of an exponential and a normal. Any recommendations for which model to use: gamma, exponential?
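
For readers unfamiliar with the ex-Gaussian: it is usually described as the distribution of the sum of an independent normal and an exponential component. A small Python sketch under that definition follows. The parameter values (`mu`, `sigma`, `tau`, in milliseconds) are hypothetical and only meant to show the right skew that makes distributions such as the gamma or lognormal plausible candidates.

```python
import random
import statistics


def sample_ex_gaussian(mu, sigma, tau, n, seed=0):
    """Draw n ex-Gaussian values: Normal(mu, sigma) + Exponential(mean=tau)."""
    rng = random.Random(seed)
    # expovariate takes the rate, which is 1 / mean.
    return [rng.gauss(mu, sigma) + rng.expovariate(1.0 / tau) for _ in range(n)]


# Hypothetical reaction-time parameters in milliseconds.
rts = sample_ex_gaussian(mu=400.0, sigma=40.0, tau=150.0, n=20000)

mean_rt = statistics.fmean(rts)     # theoretical mean is mu + tau
median_rt = statistics.median(rts)  # falls below the mean when right-skewed
```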

 

Can we run a GLMM (with a random effect in the model) in JMP?

Yes, it is in JMP Pro 17, which is currently used by Early Adopters.

 

For the gamma, does JMP display lambda and variance?

 

Is there a way to use the Simulate feature in JMP Pro to do a Power analysis for an Ordinal Logistic model?

 

Does the simulate column have a random component?

 
