Liz_S
Level II

When running Logistic Regressions with no intercepts, has anyone observed very high General RSquares, about 25-30 points higher than models with intercepts?

On a couple of different projects with a 0/1 outcome variable to predict, I have noticed that checking the No Intercept box when running the logistic regression (generalized, with binomial variance) boosts the Generalized R Square substantially, by about 25 to 30 percentage points. My latest model is for a rare event that is observed in about 1.2% of the experience-period population. Predictive models with an intercept yield Generalized R Squares of about 0.60 to 0.67, while models without the intercept come in at about 0.95 to 0.98.

I do have some key variables that are highly predictive, so at first I thought Generalized R Squares above 90% seemed reasonable. But the confusion matrix shows more errors than I would like, even after lowering the threshold to 2%-5%.

I do like the idea that No Intercept implies a blind log-odds ratio for the constant term: with Intercept = 0, the baseline is like flipping a coin, Probability = 0.50. But perhaps such baseline models are too easy to beat, which inflates the Generalized R Square, since it depends on the likelihood ratio of L0 (the intercept-only model) to LM (the fitted model with X predictors). That seems especially likely when I know beforehand (a priori) that the outcome event is rare.

So, while it would be great to write a brief reporting a Generalized R Square of about 95%-98%, I think it might be more prudent and practical to use a model with an intercept that comes in at a Generalized R Square of about 60%. Please reply if you have any advice for me. Thanks!
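The suspicion in the question can be checked with a little arithmetic. The sketch below is only an illustration, not JMP's implementation: it assumes the Generalized R Square is the likelihood-ratio R square rescaled to a maximum of 1 (the Nagelkerke form), and it assumes that a no-intercept fit is compared against a baseline with every probability fixed at 0.50. The sample size (10,000), the event count (120, i.e. 1.2%), and the fitted log-likelihood (-300) are made-up numbers, and the same fitted log-likelihood is reused for both baselines purely to isolate the effect of the baseline.

```python
import math


def bernoulli_loglik(n_events, n_total, p):
    """Log-likelihood of n_events successes in n_total trials at a constant probability p."""
    return n_events * math.log(p) + (n_total - n_events) * math.log(1.0 - p)


def generalized_r2(ll_null, ll_model, n_total):
    """Likelihood-ratio R^2, 1 - (L0/LM)^(2/n), rescaled so its maximum is 1."""
    unscaled = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n_total)
    max_value = 1.0 - math.exp(2.0 * ll_null / n_total)
    return unscaled / max_value


# Hypothetical numbers chosen only for illustration (not from the poster's data).
n_total = 10_000
n_events = 120          # 1.2% event rate
ll_model = -300.0       # assumed log-likelihood of some fitted model

# Baseline 1: intercept-only model, which predicts the observed event rate.
ll_null_intercept = bernoulli_loglik(n_events, n_total, n_events / n_total)

# Baseline 2 (assumption): no intercept and no predictors, so p = 0.50 for every row.
ll_null_zero = bernoulli_loglik(n_events, n_total, 0.5)

r2_intercept_baseline = generalized_r2(ll_null_intercept, ll_model, n_total)
r2_zero_baseline = generalized_r2(ll_null_zero, ll_model, n_total)

print(f"Null log-likelihood, intercept-only: {ll_null_intercept:.1f}")
print(f"Null log-likelihood, p = 0.50:       {ll_null_zero:.1f}")
print(f"Generalized R^2 vs intercept-only:   {r2_intercept_baseline:.3f}")
print(f"Generalized R^2 vs p = 0.50:         {r2_zero_baseline:.3f}")
```

With these made-up inputs the same fitted likelihood scores roughly 0.55 against the intercept-only baseline and roughly 0.98 against the coin-flip baseline. Under the stated assumptions, the gap comes entirely from how badly p = 0.50 describes a 1.2% event, not from a better model.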

Accepted Solutions
Liz_S
Level II

Re: When running Logistic Regressions with no intercepts, has anyone observed very high General RSquares, about 25-30 points higher than models with intercepts?

Hello, this morning I received a detailed response from JMP Support with numerous suggestions that will help me model more efficiently in JMP and try some techniques I have not used before. The response (from Patrick Giuliano) referenced this article and advice: "The importance of 'many approaches' leads to a common and defendable solution. From Lavine, M., Frequentist, Bayes, or Other? (Summarized in Editorial, The American Statistician, 2019, Vol. 73, No. S1, 1-19): 1. Look for and present results from many models that fit the data well. 2. Evaluate models, not just procedures."

Essentially, I learned that the very high Generalized R Squares (~98%) for the no-intercept models probably indicate a lack of stability, and that forcing the linear models through the origin was too strong an assumption. Perhaps I should also revisit some of the modeling issues created by multicollinearity among the predictors. It was a helpful reply! I appreciate being able to reach out to JMP Support with my de-identified data and my scripts. Thanks much!

 


4 REPLIES 4

Re: When running Logistic Regressions with no intercepts, has anyone observed very high General RSquares, about 25-30 points higher than models with intercepts?

I compared the models with and without an intercept in a few examples and always observed the opposite trend: the R square metrics were better when the model included an intercept term. If you can reproduce the results that you reported, then I suggest contacting JMP Technical Support (support@jmp.com) to get a resolution. Please reply to this discussion with their findings for the benefit of the Community.

Liz_S
Level II

Re: When running Logistic Regressions with no intercepts, has anyone observed very high General RSquares, about 25-30 points higher than models with intercepts?

Yes, I am sending my example JMP file to JMP Support today. The model without an intercept has a Generalized RSquare of 98%, and the model with an intercept has a Generalized RSquare of 55%. I'll keep the community posted.

Re: When running Logistic Regressions with no intercepts, has anyone observed very high General RSquares, about 25-30 points higher than models with intercepts?

I'm glad that you got a helpful answer. Best of luck in all your modeling!
