

Developer Tutorial: Selecting the Appropriate JMP Pro Generalized Regression Distribution for Your Response

Background:

  • Simple linear regression assumes that the errors (and therefore the response) are normal. When normality doesn't hold, predictions may fall outside of the meaningful range (maybe not a big deal) and inference is not reliable (probably a bigger deal)
  • Generalized Linear Models (GLMs) let the response follow a distribution other than the normal, for example for:
    • Count data (e.g., number of defects on a product)

    • Skewed data (e.g., salaries)

    • Proportions

    • Labels (e.g., good/neutral/bad or yellow/blue/green)

  • Generalized Linear Models (GLM) have three ingredients
    • A distribution for the response given the predictors (the random piece)
    • A linear predictor (the systematic piece)
    • A link function (the piece that connects the random and systematic pieces)
  • Use the R-square to compare models within a distribution. 

  • Use AICc and BIC (information criteria) to compare between distributions

    • AICc and BIC estimate the Kullback-Leibler divergence, which is the distance from the fitted model to the truth

    • Use them to compare models within the same distribution and across different distributions

    • Rule of thumb: AIC tends to overfit and BIC tends to underfit

  • General guidelines for choosing Distributions for Continuous Response
    • Do we have negative values? Use normal

    • Is it bound to (0,1)?  Use beta

    • Does variance increase with the mean?  Use gamma, Weibull, lognormal

    • Is it time-to-event/censored? Probably use Weibull or lognormal

    • A pretty good catch-all? Use normal

    • Do we suspect that we have outliers? Use Cauchy or t(5)

  • Choosing a distribution when the response isn’t numeric
    • Is it two-level? Use the binomial (e.g., Yes/No or A/B)
    • Is it 3+ levels and order matters? Use ordinal logistic (e.g., Low/Medium/High or Small/Medium/Large)
    • Is it 3+ levels and order doesn’t matter? Use the multinomial (e.g., Pizza/Hamburger/Burrito or Red/Blue/Green/Orange)
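
The three GLM ingredients listed in the background (distribution, linear predictor, link) can be made concrete with a small sketch. This is not how JMP Pro implements Generalized Regression; it is a minimal Python illustration for a count response, assuming a Poisson distribution with a log link. The data, coefficient values, and function names are all hypothetical.

```python
import math

def linear_predictor(coefs, x):
    """Systematic piece: eta = b0 + b1*x1 + ... (coefs[0] is the intercept)."""
    return coefs[0] + sum(b * xi for b, xi in zip(coefs[1:], x))

def inverse_log_link(eta):
    """Link piece: the log link sets log(mu) = eta, so mu = exp(eta) > 0."""
    return math.exp(eta)

def poisson_log_pmf(y, mu):
    """Random piece: log P(Y = y) for a Poisson response with mean mu."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)

def log_likelihood(coefs, rows, counts):
    """Sum of Poisson log-probabilities over all observations."""
    return sum(
        poisson_log_pmf(y, inverse_log_link(linear_predictor(coefs, x)))
        for x, y in zip(rows, counts)
    )

# Hypothetical data: one predictor and a defect count per unit.
rows = [(0.0,), (1.0,), (2.0,), (3.0,)]
counts = [1, 2, 4, 9]

# Compare two candidate coefficient vectors (assumed values, not fitted).
print(log_likelihood([0.0, 0.0], rows, counts))
print(log_likelihood([0.0, 0.7], rows, counts))
```

Because the log link maps the linear predictor through exp(), the predicted mean can never be negative, which is the kind of "meaningful range" guarantee that a normal model does not give for counts.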

See how to choose, specify and build, compare, and evaluate models using Generalized Regression in JMP Pro. Q&A is included throughout the presentation.
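
To illustrate comparing distributions with information criteria, here is a minimal Python sketch using the standard formulas AICc = -2 logL + 2k + 2k(k+1)/(n-k-1) and BIC = -2 logL + k ln(n). It fits intercept-only normal and lognormal models by maximum likelihood to a made-up, right-skewed sample. The data and function names are assumptions for illustration, and JMP Pro's own reports may differ in detail.

```python
import math

def aicc(loglik, k, n):
    """AICc = -2*logL + 2k + 2k(k+1)/(n-k-1); requires n > k + 1."""
    return -2.0 * loglik + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)

def bic(loglik, k, n):
    """BIC = -2*logL + k*ln(n)."""
    return -2.0 * loglik + k * math.log(n)

def normal_loglik(values):
    """Maximized normal log-likelihood using the MLE mean and variance."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)

def lognormal_loglik(values):
    """Maximized lognormal log-likelihood on the original scale of y."""
    logs = [math.log(v) for v in values]
    # The -sum(log y) Jacobian term keeps this comparable with normal_loglik.
    return normal_loglik(logs) - sum(logs)

# Hypothetical right-skewed response (e.g., salaries in thousands).
y = [32, 35, 38, 41, 44, 47, 52, 58, 66, 79, 98, 140]
n, k = len(y), 2  # each model estimates a location and a scale parameter

for name, ll in [("normal", normal_loglik(y)), ("lognormal", lognormal_loglik(y))]:
    print(name, round(aicc(ll, k, n), 2), round(bic(ll, k, n), 2))
```

Smaller values indicate the preferred model. Both log-likelihoods are computed on the original response scale, which is what makes the comparison across distributions valid.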

 

 

Comments

During the demo, these questions were addressed:

 

Can you use zero inflated for an inverse problem, where you have a lot of extreme high measures?

 

I typically look at reaction times, which tend to follow an ex-Gaussian distribution. That distribution is a convolution of an exponential and a normal. Any recommendations for which model to use: gamma, exponential?
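
For readers unfamiliar with the ex-Gaussian mentioned in this question, the following Python sketch simulates it as the sum of an independent normal and exponential draw and checks the sample against the known mean and standard deviation. The parameter values are hypothetical, and the sketch does not recommend a particular Generalized Regression distribution.

```python
import random
import statistics

def ex_gaussian_sample(mu, sigma, tau, rng):
    """One ex-Gaussian draw: Normal(mu, sigma) plus an independent
    exponential with mean tau."""
    return rng.gauss(mu, sigma) + rng.expovariate(1.0 / tau)

rng = random.Random(0)  # fixed seed so the run is repeatable
mu, sigma, tau = 400.0, 40.0, 150.0  # hypothetical reaction-time values (ms)
draws = [ex_gaussian_sample(mu, sigma, tau, rng) for _ in range(20000)]

mean = statistics.fmean(draws)
sd = statistics.pstdev(draws)
# Sample skewness: average cubed standardized deviation.
skew = sum(((d - mean) / sd) ** 3 for d in draws) / len(draws)

print(round(mean, 1))  # theory: mu + tau
print(round(sd, 1))    # theory: sqrt(sigma**2 + tau**2)
print(round(skew, 2))  # positive skew means a long right tail
```

The positive skew is what makes such data a candidate for the right-skewed distributions in the guidelines above rather than the normal.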

 

Can we run a GLMM (with a random effect in the model) in JMP?

Yes, it is in JMP Pro 17, which is currently being used by Early Adopters.

 

For the gamma, does JMP display lambda and variance?

 

Is there a way to use the Simulate feature in JMP Pro to do a Power analysis for an Ordinal Logistic model?

 

Does the simulate column have a random component?

 

 
