
Developer Tutorial: Selecting the Appropriate JMP Pro Generalized Regression Distribution for Your Response

Published on ‎11-07-2024 03:30 PM by Staff | Updated on ‎11-07-2024 05:40 PM

Background:

  • Simple linear regression assumes the errors (and therefore the response, given the predictors) are normal. When normality does not hold, predictions may fall outside the meaningful range (maybe not a big deal) and inference is not reliable (probably a bigger deal)
  • Generalized Linear Models (GLMs) assume a distribution other than the normal for the response, for example for:
    • Count data (e.g., number of defects on a product)

    • Skewed data (e.g., salaries)

    • Proportions

    • Labels (e.g., good/neutral/bad or yellow/blue/green)

  • A GLM has three ingredients:
    • A distribution for the response given the predictors (the random piece)
    • A linear predictor (the systematic piece)
    • A link function (the piece that connects the random and systematic pieces)
  • Use the R-square to compare models within a distribution. 

  • Use AICc and BIC (information criteria) to compare between distributions

    • AICc and BIC estimate the Kullback-Leibler divergence, a measure of how far the fitted model is from the truth, so smaller values are better

    • Use them to compare models within the same distribution and across different distributions

    • Rule of thumb: AIC tends to overfit and BIC tends to underfit

  • General guidelines for choosing a distribution for a continuous response
    • Do we have negative values? Use normal

    • Is it bounded in (0,1)? Use beta

    • Does variance increase with the mean?  Use gamma, Weibull, lognormal

    • Is it time-to-event/censored? Probably use Weibull or lognormal

    • A pretty good catch-all? Use normal

    • Do we suspect that we have outliers? Use Cauchy or t(5)

  • Choosing a distribution when the response isn't numeric
    • Is it two-level? Use the binomial (e.g., Yes/No or A/B)
    • Is it 3+ levels and order matters? Use ordinal logistic (e.g., Low/Medium/High or Small/Medium/Large)
    • Is it 3+ levels and order doesn't matter? Use the multinomial (e.g., Pizza/Hamburger/Burrito or Red/Blue/Green/Orange)
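
The guidelines above can be read as a small decision procedure. Below is a minimal Python sketch of that reading, not JMP's own logic: the function names `suggest_continuous` and `suggest_categorical`, the flag arguments, and in particular the order in which the questions are checked are all assumptions made for illustration. The returned labels are just the distribution names from the bullets.

```python
def suggest_continuous(y, censored=False, suspect_outliers=False,
                       variance_grows_with_mean=False):
    """Suggest candidate distributions for a continuous response.

    y is a list of observed response values. The flags encode judgment
    calls that cannot be read off the data alone. The order of the checks
    is an assumption of this sketch, not something the guidelines state.
    """
    if censored:
        # Time-to-event / censored data
        return ["Weibull", "Lognormal"]
    if suspect_outliers:
        # Heavy-tailed choices
        return ["Cauchy", "t(5)"]
    if any(v < 0 for v in y):
        # Negative values are present
        return ["Normal"]
    if all(0 < v < 1 for v in y):
        # Response is strictly between 0 and 1
        return ["Beta"]
    if variance_grows_with_mean:
        return ["Gamma", "Weibull", "Lognormal"]
    # Catch-all
    return ["Normal"]


def suggest_categorical(n_levels, ordered=False):
    """Suggest a distribution for a non-numeric response."""
    if n_levels < 2:
        raise ValueError("a categorical response needs at least two levels")
    if n_levels == 2:
        return "Binomial"
    return "Ordinal Logistic" if ordered else "Multinomial"
```

For example, `suggest_continuous([0.1, 0.5, 0.9])` returns `["Beta"]` and `suggest_categorical(3, ordered=True)` returns `"Ordinal Logistic"`. When several candidates come back, the information criteria described earlier are the tool for choosing among them.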

See how to choose, specify and build, compare, and evaluate models using Generalized Regression in JMP Pro. Q&A is included throughout the presentation.
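
To make the "compare between distributions" step concrete, here is a minimal Python sketch, outside of JMP, that fits three intercept-only models (normal, lognormal, exponential) by maximum likelihood and compares them with AICc and BIC. The data values are made up for illustration, the three distributions were picked only because their maximum-likelihood fits have closed forms, and the formulas used are the standard definitions AICc = -2logL + 2k + 2k(k+1)/(n-k-1) and BIC = -2logL + k ln(n). Treat it as an illustration of the idea rather than a reproduction of what Generalized Regression computes.

```python
import math


def normal_loglik(y):
    """Maximized log-likelihood of a normal model (mean and variance fitted)."""
    n = len(y)
    mu = sum(y) / n
    var = sum((v - mu) ** 2 for v in y) / n  # maximum-likelihood variance
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)


def lognormal_loglik(y):
    """Maximized log-likelihood of a lognormal model on the original scale."""
    logs = [math.log(v) for v in y]
    # Normal log-likelihood of log(y) plus the Jacobian term -sum(log y),
    # so the value is comparable with likelihoods computed on y itself.
    return normal_loglik(logs) - sum(logs)


def exponential_loglik(y):
    """Maximized log-likelihood of an exponential model (rate = 1 / mean)."""
    n = len(y)
    mean = sum(y) / n
    return -n * (math.log(mean) + 1.0)


def aicc(loglik, k, n):
    """Corrected AIC for k estimated parameters and n observations."""
    return -2.0 * loglik + 2.0 * k + 2.0 * k * (k + 1) / (n - k - 1)


def bic(loglik, k, n):
    """Bayesian information criterion."""
    return -2.0 * loglik + k * math.log(n)


# Hypothetical right-skewed, strictly positive response values.
y = [1.2, 0.8, 2.5, 1.1, 0.9, 7.4, 1.6, 3.3, 0.7, 12.0, 2.1, 1.4]
n = len(y)

# name -> (maximized log-likelihood, number of estimated parameters)
fits = {
    "normal": (normal_loglik(y), 2),
    "lognormal": (lognormal_loglik(y), 2),
    "exponential": (exponential_loglik(y), 1),
}

# name -> (AICc, BIC); smaller is better for both criteria.
results = {name: (aicc(ll, k, n), bic(ll, k, n)) for name, (ll, k) in fits.items()}
best = min(results, key=lambda name: results[name][0])

for name, (a, b) in sorted(results.items(), key=lambda item: item[1][0]):
    print(f"{name:12s} AICc={a:8.2f}  BIC={b:8.2f}")
print("smallest AICc:", best)
```

The comparison across distributions is only fair because every log-likelihood is computed on the original scale of the response, which is why the lognormal fit includes the Jacobian term.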

 

Video: Selecting the Appropriate JMP Pro Generalized Regression Distribution for your Response (duration 1:10:54)

     
