matteo_patelmo · Oct 1, 2022 10:39 AM

Hello.

I have encountered the following issue while fitting a Poisson GLM to the data in the attached table (Response = Count, Regressor = X).

Visual exploration of the data suggests a clear relationship between X and Count, but comparing AICc for the model with X against the null model (intercept only) favors the null model. Scripts to reproduce the two models are embedded.
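For readers without the attached table, the comparison can be sketched outside JMP. The following is a minimal Python sketch, not the JMP calculation: the data are made up, the function names are hypothetical, and the log-likelihood includes the log(y!) constant. It fits log(mu) = b0 + b1*X by Newton-Raphson and computes AICc = -2 logL + 2k + 2k(k+1)/(n-k-1) for both models.

```python
import math

def poisson_loglik(y, mu):
    """Full Poisson log-likelihood, including the -log(y!) constant."""
    return sum(yi * math.log(mi) - mi - math.lgamma(yi + 1)
               for yi, mi in zip(y, mu))

def fit_poisson_loglinear(x, y, iters=50):
    """Newton-Raphson fit of log(mu) = b0 + b1*x; returns (b0, b1)."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start at the intercept-only fit
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the log-likelihood)
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Fisher information (2x2), solved by hand
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < 1e-12:
            break
    return b0, b1

def aicc(loglik, k, n):
    """Small-sample corrected AIC for k estimated parameters and n rows."""
    return -2.0 * loglik + 2.0 * k + 2.0 * k * (k + 1) / (n - k - 1)

# Made-up data for illustration only (NOT the table from this thread)
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [2, 3, 3, 5, 6, 9, 11, 14, 20, 25]
n = len(y)

mu_null = [sum(y) / n] * n                      # intercept-only fit
b0, b1 = fit_poisson_loglinear(x, y)
mu_x = [math.exp(b0 + b1 * xi) for xi in x]     # fit with X

aicc_null = aicc(poisson_loglik(y, mu_null), 1, n)
aicc_x = aicc(poisson_loglik(y, mu_x), 2, n)
print(f"AICc null = {aicc_null:.2f}, AICc with X = {aicc_x:.2f}")
```

With a clear trend in the counts, as in this made-up example, the model with X should have the smaller AICc; a result in the other direction is a reason to inspect the log-likelihoods that went into each value.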

I believe I have found an explanation in the following note from the book by Burnham and Anderson (Model Selection and Multimodel Inference, Second Edition, 2002), but I would like someone to confirm it. If this is right, it is unfortunate that it has not been fixed in JMP.

"One must be careful when using some standard software packages (e.g., SAS GENMOD), since they were developed some time ago under a hypothesis testing mode (i.e., adjusting χ² test statistics by ĉ to obtain F-tests). In some cases, a separate estimate of c is made for each model, and variances and covariances are multiplied by this model-specific estimate of the variance inflation factor. Some software packages compute an estimate of c for every model, thus making the correct use of model selection criteria tricky unless one is careful. Instead, we recommend that the global model be used as a basis for the estimation of a single variance inflation factor c."
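The recommendation in the quoted note can be sketched as follows. This is a minimal Python illustration under stated assumptions, not what JMP or SAS computes: ĉ is taken as the Pearson chi-square of the global model divided by its residual degrees of freedom, the same ĉ is then used for every candidate, and ĉ is counted as one extra parameter in the QAICc penalty. All function names and numbers are hypothetical.

```python
def pearson_c_hat(y, mu, n_params):
    """Pearson chi-square / residual df, from the GLOBAL model's fitted means."""
    chi2 = sum((yi - mi) ** 2 / mi for yi, mi in zip(y, mu))
    return chi2 / (len(y) - n_params)

def qaicc(loglik, k, n, c_hat):
    """QAICc with one shared c_hat; K = k + 1 counts c_hat as a parameter."""
    K = k + 1
    return -2.0 * loglik / c_hat + 2.0 * K + 2.0 * K * (K + 1) / (n - K - 1)

# Made-up counts and made-up fitted means from a hypothetical global model
y_obs = [2, 3, 3, 5, 6, 9, 11, 14, 20, 25]
mu_global = [2.0, 2.7, 3.6, 4.7, 6.3, 8.3, 11.0, 14.6, 19.3, 25.6]
c_hat = pearson_c_hat(y_obs, mu_global, n_params=2)

# Placeholder log-likelihoods for two candidates; both use the SAME c_hat
candidates = {"intercept only": (-45.0, 1), "with X": (-20.0, 2)}
for name, (loglik, k) in candidates.items():
    print(name, round(qaicc(loglik, k, len(y_obs), c_hat), 2))
```

The point of the design is that `c_hat` is computed once and passed to every call; estimating it separately per model is the practice the note warns against.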

P.S. Using the negative binomial distribution in Generalized Regression (lasso), the minimum-AICc model is the one with X as the regressor, as one would expect.

Thanks,

Matteo

Mark_Bailey · Oct 3, 2022 10:40 AM

The AICc is useful information when selecting a model from among many model candidates. Are there more candidates that you did not present here?

I do not think that AICc is valid when you fit the model with only the constant term. Here is the Whole Model Test for the model with X and the model without X.

With X: [Whole Model Test report, image not preserved]

Without X: [Whole Model Test report, image not preserved]

It doesn't make sense that the Reduced -LogLikelihood differs so much between the two reports, because in both cases it should come from the same model (intercept only). The large difference in AICc is due to this discrepancy. I do not think that it is due to the issue raised in the literature that you cited. That issue concerns the earlier practice of computing a separate variance inflation factor for each model.
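The reduced value can be checked independently of any fitting routine. For an intercept-only Poisson model the maximum likelihood estimate of the mean is the sample mean of the response, so the reduced negative log-likelihood depends on the counts alone and not on X. Below is a minimal Python sketch with made-up counts and hypothetical function names. It includes the log(y!) constant; software that drops the constant would report a value shifted by the same amount for every model fitted to the same response.

```python
import math

def poisson_nll(y, mu):
    """Negative Poisson log-likelihood for a common mean mu (with log(y!) term)."""
    return -sum(yi * math.log(mu) - mu - math.lgamma(yi + 1) for yi in y)

def null_poisson_nll(y):
    """Reduced (intercept-only) NLL: the MLE of the mean is the sample mean."""
    return poisson_nll(y, sum(y) / len(y))

# Made-up counts for illustration only (NOT the table from this thread)
counts = [2, 3, 3, 5, 6, 9, 11, 14, 20, 25]
reduced = null_poisson_nll(counts)
print(f"Reduced -LogLikelihood = {reduced:.4f}")

# Any other common mean gives a larger value, confirming the minimum
mean = sum(counts) / len(counts)
print(poisson_nll(counts, 0.9 * mean) > reduced,
      poisson_nll(counts, 1.1 * mean) > reduced)
```

Because no regressor enters this calculation, the same number should appear as the reduced value whether it is reported alongside the model with X or the model without X.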

The Regression Plot and the Studentized Deviance Residual by Predicted plot show a good fit with X.

You have a single regressor, X. The inferential tests provided in the GLM should suffice to decide whether X is important.

Conclusions:

The whole model is significant.
The term for X is significant.
Over-dispersion is not significant.
Lack of fit is not significant.
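The whole-model conclusion rests on a likelihood ratio test: twice the difference between the reduced and full negative log-likelihoods, compared against a chi-square distribution with one degree of freedom per added term. Below is a minimal Python sketch for the single-regressor case (one degree of freedom). The function name is hypothetical and the input values are placeholders, not the values from the reports above.

```python
import math

def lr_test_1df(nll_reduced, nll_full):
    """Likelihood ratio chi-square and p-value for ONE added term (1 df)."""
    stat = 2.0 * (nll_reduced - nll_full)
    # For 1 df, P(chi-square > s) = P(|Z| > sqrt(s)) = erfc(sqrt(s / 2))
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value

# Placeholder negative log-likelihoods, for illustration only
stat, p_value = lr_test_1df(nll_reduced=60.0, nll_full=35.0)
print(f"L-R chi-square = {stat:.2f}, p = {p_value:.3g}")
```

The test uses the same reduced value that enters the null model's AICc, so an inconsistent reduced -LogLikelihood would affect one of the two summaries but not the other.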

matteo_patelmo · Oct 5, 2022 03:57 AM

Thanks, Mark. I will study your explanation in detail; it is still a bit tricky for me :).

Matteo

matteo_patelmo · Oct 5, 2022 04:23 AM

Hello Mark, some answers and comments (your statements are in quotation marks).

"Are there more candidates that you did not present here?" No, this is a very simple case (but real data), which in my opinion makes it a good one for understanding the underlying statistical machinery.

"I do not think that AICc is valid when you fit the model with only the constant term." If this is so, why do both GLM and Generalized Regression output AICc for the null models, and why does the latter display it in the solution path?

Thanks if you can clarify this further.
Matteo

Mark_Bailey · Oct 5, 2022 08:20 AM

Both platforms report AICc for every model they fit. They do not single out this unusual case and omit AICc for it.

It appears, however, that AICc might not be correct in this case. The calculation is under investigation.

matteo_patelmo · Oct 5, 2022 01:16 PM

Thanks again Mark. Looking forward to receiving more updates on the investigation!

Matteo

PatrickGiuliano · Nov 12, 2022 12:59 AM

Thanks for your diligence in bringing this one to our attention! We have confirmed that the AICc calculation is incorrect, and we have flagged it to be fixed in a future release of JMP. -JMP Technical Support

GLM / Poisson with Overdispersion // use of AICc
