
Developer Tutorial: Selecting the Appropriate JMP Pro Generalized Regression Distribution for Your Response

Published on 11-07-2024 03:30 PM by Staff | Updated on 11-07-2024 05:40 PM

Background:

  • Simple linear regression assumes the errors (and therefore the response, given the predictors) are normal. When they are not, predictions may fall outside the meaningful range (maybe not a big deal) and inference is unreliable (probably a bigger deal)
  • Generalized Linear Models (GLM) assume a distribution other than the normal for the response, for example for:
    • Count data (e.g., number of defects on a product)

    • Skewed data (e.g., salaries)

    • Proportions

    • Labels (e.g., good/neutral/bad or yellow/blue/green)

  • Generalized Linear Models (GLM) have three ingredients
    • A distribution for the response given the predictors (the random piece)
    • A linear predictor (the systematic piece)
    • A link function (the piece that connects the random and systematic pieces)
  • Use R-square to compare models within a single distribution

  • Use the information criteria AICc and BIC to compare between distributions

    • AICc and BIC are used as estimates of the Kullback-Leibler divergence, a measure of how far the fitted model is from the truth; smaller values are better

    • Use them to compare models within the same distribution and across different distributions

    • Rule of thumb: AIC tends to overfit and BIC tends to underfit

  • General guidelines for choosing a distribution for a continuous response
    • Do we have negative values? Use normal

    • Is it bounded on (0,1)? Use beta

    • Does variance increase with the mean? Use gamma, Weibull, or lognormal

    • Is it time-to-event/censored? Probably use Weibull or lognormal

    • A pretty good catch-all? Use normal

    • Do we suspect that we have outliers? Use Cauchy or t(5)

  • Choosing a distribution when the response isn’t numeric
    • Is it two-level? Use the binomial (e.g., Yes/No or A/B)
    • Is it 3+ levels and order matters? Use ordinal logistic (e.g., Low/Medium/High or Small/Medium/Large)
    • Is it 3+ levels and order doesn’t matter? Use the multinomial (e.g., Pizza/Hamburger/Burrito or Red/Blue/Green/Orange)
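
The three GLM ingredients listed in the background (a response distribution, a linear predictor, and a link function) can be made concrete with a small sketch. The Python below is not JMP code; it assumes a Poisson response for count data with a log link, and the coefficients, predictor values, and defect counts are hypothetical numbers chosen only for illustration.

```python
import math

def linear_predictor(x, intercept, slope):
    """Systematic piece: eta = b0 + b1 * x."""
    return intercept + slope * x

def inverse_log_link(eta):
    """Link piece: the log link sets log(mean) = eta, so mean = exp(eta)."""
    return math.exp(eta)

def poisson_log_pmf(y, mu):
    """Random piece: log P(Y = y) for a Poisson response with mean mu."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)

# Hypothetical coefficients and data (assumptions, not from the tutorial).
intercept, slope = 0.5, 0.3
xs = [0.0, 1.0, 2.0, 3.0]
defects = [2, 2, 3, 4]

log_likelihood = 0.0
for x, y in zip(xs, defects):
    eta = linear_predictor(x, intercept, slope)   # can be any real number
    mu = inverse_log_link(eta)                    # always positive
    log_likelihood += poisson_log_pmf(y, mu)
    print(f"x={x:.0f}  eta={eta:.2f}  mean={mu:.2f}")

print(f"log-likelihood = {log_likelihood:.3f}")
```

The link is what keeps predictions in a meaningful range: the linear predictor can take any real value, but the exponentiated mean is always positive, which is what a count response requires.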

See how to choose, specify and build, compare, and evaluate models using Generalized Regression in JMP Pro. Q&A is included throughout the presentation.
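
The advice to compare distributions with AICc and BIC can be sketched outside JMP as well. The Python below uses the textbook definitions, AICc = -2 logL + 2k + 2k(k+1)/(n-k-1) and BIC = -2 logL + k ln(n), and fits intercept-only normal and lognormal models by maximum likelihood. The salary-like values are made up for illustration, and values reported by JMP Pro may differ depending on how parameters are counted.

```python
import math

def aicc(log_lik, k, n):
    """Corrected AIC: -2*logL + 2k + 2k(k+1)/(n-k-1)."""
    return -2.0 * log_lik + 2.0 * k + 2.0 * k * (k + 1) / (n - k - 1)

def bic(log_lik, k, n):
    """BIC: -2*logL + k*ln(n)."""
    return -2.0 * log_lik + k * math.log(n)

def normal_log_lik(ys):
    """Maximized normal log-likelihood (mean and variance at their MLEs)."""
    n = len(ys)
    mean = sum(ys) / n
    var = sum((y - mean) ** 2 for y in ys) / n
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)

def lognormal_log_lik(ys):
    """Maximized lognormal log-likelihood: normal on log(y) plus the Jacobian term."""
    logs = [math.log(y) for y in ys]
    return normal_log_lik(logs) - sum(logs)

# Made-up, right-skewed sample (an assumption for illustration only).
salaries = [21, 25, 28, 30, 33, 37, 41, 48, 60, 95, 150, 240]
n, k = len(salaries), 2  # each model estimates a location and a scale

results = {
    "Normal": normal_log_lik(salaries),
    "Lognormal": lognormal_log_lik(salaries),
}
for name, ll in results.items():
    print(f"{name:10s} AICc={aicc(ll, k, n):8.2f}  BIC={bic(ll, k, n):8.2f}")

best = min(results, key=lambda name: aicc(results[name], k, n))
print("Lowest AICc:", best)
```

Because both likelihoods are computed on the original scale of the response, the criteria are directly comparable across the two distributions. For this skewed sample the lognormal has the lower AICc, in line with the guideline for responses whose variance grows with the mean.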

 

Video: Selecting the Appropriate JMP Pro Generalized Regression Distribution for your Response (duration 1:10:54)