Solved: Zero Inflated Poisson Model Parameterization

farinoushsharif · Aug 3, 2017 02:09 PM

Hello JMP Community,

I have a dataset with a count variable to be predicted (Y) and four independent variables: three numerical (X1, X2, X3) and one categorical (X4).

I fitted a zero-inflated Poisson (ZIP) model to the data in both R and JMP (Fit Model > Generalized Regression), but the results were not the same. Why are they different? I think there may be some hidden assumptions, such as the optimization method or the number of iterations. How can I see the underlying assumptions of the zero-inflated Poisson model in JMP?

I compared ZIP models in R and JMP using different sets of variables. The results always agree; they differ only when I include the categorical variable (X4) in the model.

Mark_Bailey · Aug 3, 2017 03:13 PM

What version of JMP Pro are you using? The algorithms are improved over time.

Which estimation method in Generalized Regression did you use? The lasso, double lasso, ridge regression, and elastic net, for example, produce different results.

What options did you use for the fitting? The adaptive versions will produce different results.

What validation method did you use? The random assignment of observations to different hold out sets produces different results each time you fit the data.

farinoushsharif · Aug 3, 2017 03:21 PM

Thank you for the response!

I am using JMP Pro 12.0.1. I used maximum likelihood and no validation in both R and JMP.

Also, I've just found that it is not only about the zero-inflated model. Fitting a generalized linear model (Poisson) to the data also gives different results in R and JMP. Again, when I take out the categorical variable, the models are the same. Is it possible that there are some assumptions about categorical data?

Mark_Bailey · Aug 3, 2017 04:13 PM

I do not want to seem 'picky' but we use terms with specific meanings. I don't think that it has anything to do with the model assumptions, but it might have something to do with the model parameterization, the categorical predictor in particular. I don't know how the model parameters are represented for estimation in the R object that you are using. I do know how JMP parameterizes categorical predictors.

Here is the place to discover the JMP parameterization for categorical factors in JMP 12. (The parameterization in JMP 13 is the same, but the location in the documentation changes!)

Select Help > Books > Fitting Linear Models > Appendix A: Statistical Details > The Factor Models. This section describes the coding for both nominal and ordinal predictor levels. The effects coding used in JMP is different from the coding used in some SAS procedures, such as PROC GLM, as you will see. It might also be different from the coding used in your R object.

I hope that this information will help resolve the differences that you have observed.
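To illustrate the point about parameterization, here is a minimal sketch (not taken from the JMP documentation) of how coefficients from a reference-level ("treatment") coding, which is R's default for unordered factors, map onto a sum-to-zero effect coding of the kind JMP uses for nominal factors. The function name `treatment_to_effect` and the numbers are hypothetical. It assumes the factor enters the model as a main effect only; the linear predictor, and therefore the fitted values, should be identical under both codings even though the reported estimates differ.

```python
def treatment_to_effect(intercept, level_effects):
    """Convert reference-level (treatment) coefficients to sum-to-zero effects.

    level_effects maps every factor level to its treatment-coded coefficient;
    the reference level must be included with the value 0.0.
    """
    mean_effect = sum(level_effects.values()) / len(level_effects)
    # The effect-coded intercept is the average linear predictor over levels.
    new_intercept = intercept + mean_effect
    # Each effect is that level's deviation from the average.
    new_effects = {lvl: b - mean_effect for lvl, b in level_effects.items()}
    return new_intercept, new_effects


# Hypothetical treatment-coded estimates (log scale) for a three-level X4.
b0 = 1.20
treat = {"A": 0.0, "B": 0.30, "C": -0.60}

e0, effect = treatment_to_effect(b0, treat)

# Both parameterizations give the same linear predictor for every level.
for lvl in treat:
    assert abs((b0 + treat[lvl]) - (e0 + effect[lvl])) < 1e-12

print(e0, effect)
```

Under effect coding the estimate for the last level is usually not listed; it is the negative of the sum of the other levels' estimates, so only the first two values of `effect` would appear in a parameter estimates table.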

ron_horne · Aug 4, 2017 07:05 AM

Hi,

JMP may be using a different coding method for contrasting categories when estimating the coefficients of categorical variables. If the variable has only two categories, setting its modeling type to ordinal will give the same result as other programs, which typically estimate a dummy variable coded with zeros and ones.
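The two-category case can be sketched by writing out the design columns directly. This is only an illustration under my understanding of the codings: nominal effect coding sets the last level to -1, ordinal coding switches column j on for every level above level j, and the usual dummy coding uses the first level as reference. The helper names are hypothetical and none of this calls JMP or R.

```python
def effect_columns(levels, value):
    """Sum-to-zero (effect) coding: the last level is -1 in every column."""
    i = levels.index(value)
    if i == len(levels) - 1:
        return [-1] * (len(levels) - 1)
    return [1 if j == i else 0 for j in range(len(levels) - 1)]


def ordinal_columns(levels, value):
    """Successive-difference coding: column j is 1 for all levels above level j."""
    i = levels.index(value)
    return [1 if i > j else 0 for j in range(len(levels) - 1)]


def dummy_columns(levels, value):
    """0/1 dummy coding with the first level as the reference."""
    i = levels.index(value)
    return [1 if i == j + 1 else 0 for j in range(len(levels) - 1)]


two = ["no", "yes"]
for v in two:
    print(v, effect_columns(two, v), ordinal_columns(two, v), dummy_columns(two, v))
```

With two levels the ordinal and dummy columns coincide (0 for the first level, 1 for the second), while the effect column is +1 and -1, so its coefficient is half the dummy coefficient with the opposite sign. With three or more levels the ordinal and dummy codings no longer match.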
