matteo_patelmo
Level IV

GLM / Poisson with Overdispersion // use of AICc

Hello.

 

I ran into the following issue while fitting a Poisson GLM to the data in the attached table (Response = Count, Regressor = X).

 

Visual exploration of the data suggests a clear relationship between X and Count, but comparing the AICc of the model with X against the null model (intercept only) favors the null model. Scripts to reproduce the two models are embedded in the table.
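For reference, a sketch of the standard small-sample corrected AIC that this comparison relies on. Here K is the number of estimated parameters, n is the number of observations, and L is the maximized likelihood; how JMP counts K when overdispersion is estimated is not stated in this thread, so that part is an assumption.

```latex
\mathrm{AICc} = -2\log L(\hat{\theta}) + 2K + \frac{2K(K+1)}{n-K-1}
```

For a Poisson GLM with one regressor, K = 1 for the intercept-only model and K = 2 for the model with X (plus one if the overdispersion parameter is counted). The model with the smaller AICc is preferred.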

 

I believe I have found an explanation in the following note from the book by Burnham and Anderson (Model Selection and Multimodel Inference, Second Edition, 2002), but I would like someone to confirm it. If this is right, it is unfortunate that it has not been fixed in JMP.

 

One must be careful when using some standard software packages (e.g.,
SAS GENMOD), since they were developed some time ago under a hypothesis testing mode (i.e., adjusting χ² test statistics by ĉ to obtain F-tests). In
some cases, a separate estimate of c is made for each model, and variances
and covariances are multiplied by this model-specific estimate of the variance
inflation factor. Some software packages compute an estimate of c for every
model, thus making the correct use of model selection criteria tricky unless
one is careful. Instead, we recommend that the global model be used as a basis
for the estimation of a single variance inflation factor c.
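A minimal sketch of the recommendation in the passage above: estimate one variance inflation factor c from the global model, then reuse it for every candidate model through the quasi-likelihood criterion QAICc. The function names and the numbers in the usage lines are hypothetical and are not taken from the attached table; flooring ĉ at 1 and counting c as one extra estimated parameter follow the usual convention and are stated here as assumptions.

```python
def c_hat(pearson_chi2: float, resid_df: int) -> float:
    """Variance inflation factor from the GLOBAL model: chi-square / df.

    Assumption: values below 1 are set to 1 (no deflation of variances).
    """
    return max(1.0, pearson_chi2 / resid_df)


def qaicc(loglik: float, k: int, n: int, c: float) -> float:
    """QAICc for one candidate model, using a single shared c.

    loglik : maximized log-likelihood of the candidate model
    k      : number of regression parameters in the candidate model
    n      : number of observations
    c      : variance inflation factor estimated once from the global model
    """
    k_total = k + 1  # assumption: c counts as one additional estimated parameter
    penalty = 2 * k_total + 2 * k_total * (k_total + 1) / (n - k_total - 1)
    return -2.0 * loglik / c + penalty


# Hypothetical numbers, for illustration only.
c = c_hat(pearson_chi2=30.0, resid_df=15)       # shared across all models
q_null = qaicc(loglik=-55.0, k=1, n=20, c=c)    # intercept only
q_full = qaicc(loglik=-50.0, k=2, n=20, c=c)    # intercept + X
```

Because the same c divides every log-likelihood, differences in QAICc between models reflect only the fit and the parameter count, not model-specific dispersion estimates.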

 

P.S. Using the negative binomial distribution in Generalized Regression (Lasso), the minimum-AICc model is the one with X as a regressor, as one would expect.

 

Thanks,

Matteo

2 ACCEPTED SOLUTIONS

Re: GLM / Poisson with Overdispersion // use of AICc

The AICc is useful information when selecting a model from among many model candidates. Are there more candidates that you did not present here?

 

I do not think that AICc is valid when you fit the model with only the constant term. Here is the Whole Model Test for the model with X and the model without X.

 

[Screenshot: Whole Model Test, with X]

[Screenshot: Whole Model Test, without X]

 

It doesn't make sense that the Reduced -LogLikelihood differs so much between the two reports, because in both cases it should describe the same model (intercept only). The large difference in AICc is due to this discrepancy. I do not think that it is due to the issue raised in the passage that you cited, which concerns the earlier practice of computing a separate variance inflation factor (VIF) for each model.
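A small sketch of why the reduced (intercept-only) log-likelihood should not change between reports: for a Poisson model with only a constant term, the maximum-likelihood mean is the sample mean, so the value depends on the counts alone. The counts below are made up for illustration and are not the data from the attached table; the function names are hypothetical. Software packages differ in whether they include the constant log(y!) term, so absolute values may not match a given report, but the value should still be identical across fits of the same data.

```python
import math


def poisson_null_negloglik(counts):
    """-log-likelihood of the intercept-only Poisson model.

    The MLE of the mean is the sample mean, so no regressor is involved.
    Requires at least one nonzero count (mean > 0).
    """
    mu = sum(counts) / len(counts)
    return -sum(y * math.log(mu) - mu - math.lgamma(y + 1) for y in counts)


def aicc(negloglik, k, n):
    """AICc from a -log-likelihood, k estimated parameters, n observations."""
    return 2.0 * negloglik + 2 * k + 2 * k * (k + 1) / (n - k - 1)


# Illustrative counts only (not the thread's data).
counts = [2, 3, 5, 8, 13, 21]
nll_null = poisson_null_negloglik(counts)
aicc_null = aicc(nll_null, k=1, n=len(counts))
```

Whatever the full model is, the reduced value computed this way is the same, which is why two different values for it point to a calculation problem rather than a modeling one.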

 

The Regression Plot and the Studentized Deviance Residual by Predicted plot both show a good fit with X.

 

You have a single regressor, X. The inferential tests provided in the GLM should suffice to decide whether X is important.

[Screenshot: whole model test]

 

Conclusions:

  • The whole model is significant.
  • The term for X is significant.
  • Over-dispersion is not significant.
  • Lack of fit is not significant.
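As a sketch of the kind of inferential test referred to above, assuming the Whole Model Test is the usual likelihood-ratio comparison of the reduced (intercept-only) and full models: with one added regressor, twice the difference in -log-likelihood is compared to a chi-square distribution with 1 degree of freedom. The function name and the input values are hypothetical, and no overdispersion adjustment is applied here.

```python
import math


def lr_test_1df(negloglik_reduced, negloglik_full):
    """Likelihood-ratio test for one added parameter.

    Returns (chi-square statistic, p-value). For 1 degree of freedom the
    chi-square upper tail is erfc(sqrt(x / 2)), so no extra library is needed.
    """
    stat = max(0.0, 2.0 * (negloglik_reduced - negloglik_full))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical -log-likelihood values, for illustration only.
stat, p_value = lr_test_1df(negloglik_reduced=60.0, negloglik_full=50.0)
```

A small p-value indicates that adding X improves the fit by more than chance alone would explain, which is the same question the AICc comparison addresses from a model-selection angle.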

 

Re: GLM / Poisson with Overdispersion // use of AICc

Thanks for your diligence in bringing this one to our attention! We have confirmed that the AICc calculation is incorrect, and we have scheduled a fix for a future release of JMP. -JMP Technical Support

6 REPLIES

matteo_patelmo
Level IV

Re: GLM / Poisson with Overdispersion // use of AICc

Thanks Mark, I will study your explanation in detail; it is still a bit tricky for me :).

 

Matteo

matteo_patelmo
Level IV

Re: GLM / Poisson with Overdispersion // use of AICc

Hello Mark, here are some answers and comments (your statements are in quotation marks).

"Are there more candidates that you did not present here?" No, this is a very simple case (but with real data), which in my opinion makes it a good one for understanding the underlying statistical machinery.

"I do not think that AICc is valid when you fit the model with only the constant term." If so, why do both GLM and Generalized Regression report AICc for the null model, and why does the latter display it in the solution path?

 

Thanks in advance if you can clarify this further.
Matteo

 

 

Re: GLM / Poisson with Overdispersion // use of AICc

Both platforms report AICc for every model they fit. They do not single out this unusual case and suppress AICc for it.

 

It appears, however, that AICc might not be correct in this case. The calculation is under investigation.

matteo_patelmo
Level IV

Re: GLM / Poisson with Overdispersion // use of AICc

Thanks again Mark. Looking forward to receiving more updates on the investigation!

 

Matteo
