Discussions

Antony · Jun 8, 2023 5:35 PM

Good morning everyone,

I would like your help so that I do not misinterpret my results.

I need to perform a multivariate analysis to assess whether the variables can predict the outcome cancer, which is a binary response variable (0/1). With a dichotomous Y and multiple predictors in my model, I first tried a nominal logistic regression, but in that case I got “unstable” estimates, which are therefore not reliable; in the same output, the likelihood-ratio test was significant for my variable of interest.

After that, I tried fitting a GLM with binomial distribution and logit link function instead, selecting the Firth adjustment (as I got a warning about quasi-complete separation of data). This way, I DID GET significant estimates and p-values.

Now, I know that fitting a GLM with binomial distribution and logit link function should be a logistic regression, am I wrong? I don’t understand why I don’t get the same results, what the differences are, and most importantly what I can infer on the basis of such results:

Can I call that model (the GLM model) a logistic regression or not?
I saw that the parameter estimates table from the GLM model shows L-R ChiSquare values, just as the LRT does, and that Prob>ChiSq is the same for the GLM model (before Firth adjustment) and the LRT. How should I interpret the estimates of my GLM model: like those from a logistic regression or like those from an LRT? Isn’t the Firth adjustment made for logistic regression?
In conclusion, what can I infer about having a significant result from the GLM model only?

Thank you very much in advance

Antony

PS. Both models could be poorly fit, but that's what I can do with my sample size. Any suggestion appreciated

Phil_Kay · Jun 22, 2021 05:37 AM

Hi,

First of all, I would be concerned that you got unstable estimates. You should do some work to explore why that is. It is hard for me to understand just from screenshots (and I am not good at languages other than English :( ).

I have attached a version of the Diabetes data set with saved scripts for models of the binary response using:

Logistic Regression,
GLM (binomial with Logit link),
and GLM with Firth bias adjustment.

You can see that the Effect Likelihood Ratio Tests for models 1 and 2 are the same. The Firth bias adjustment changes the statistics as you would expect.

You can find this in the JMP help documentation:

Firth Bias-Adjusted Estimates

Specifies that the Firth bias-adjusted method is used to fit the model. This maximum likelihood-based method has been shown to produce better estimates and tests than maximum likelihood-based models that do not use bias correction. In addition, bias-corrected MLEs ameliorate separation problems that tend to occur in logistic-type models. For more information about the separation problem in logistic regression, see Firth (1993) and Heinze and Schemper (2002).
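
For readers who want the formula behind this description, here is a sketch of the penalized log-likelihood that Firth's method maximizes in logistic regression, following the presentation in Heinze and Schemper (2002). The symbols (β for the coefficients, X for the design matrix, π_i for the fitted probabilities, W for the weight matrix) are notation introduced here for illustration; they do not come from the JMP documentation.

```latex
\ell^{*}(\beta) \;=\; \ell(\beta) \;+\; \tfrac{1}{2}\,\log\det I(\beta),
\qquad
I(\beta) \;=\; X^{\top} W X,
\qquad
W \;=\; \operatorname{diag}\{\pi_i\,(1-\pi_i)\}
```

Here ℓ(β) is the ordinary binomial log-likelihood. Under separation, ℓ(β) keeps increasing as a coefficient grows without bound, whereas the added term shrinks as the fitted probabilities approach 0 or 1, which is what pulls the estimates back to finite values.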

I hope this helps,

Phil

Antony · Jun 22, 2021 06:57 AM

Hi,

Thank you for the kind reply and sorry for the Italian screenshots!

Yes, I noticed that the Effect Likelihood Ratio Tests are the same between models 1 and 2. I also noticed that the p-values for parameter estimates in model 2 are exactly the same before Firth adjustment.

After adjustment (I can do that from model 2 only) I have rather reliable or comprehensible estimates, which I do not get in model 1, where estimates and p-values are hard to interpret. But in model 3 we still get the same p-values for the parameter estimates and Effect Likelihood Ratio Tests. So I don't understand what one can say about the parameter estimates in model 2 or 3 if there is a significant p, nor whether model 2 can be called a Firth-adjusted logistic regression.

As regards the instability, I suppose I get unstable estimates because I have few controls compared to cases, could that be? That is something I cannot correct at the moment.

So my last question is: what can I affirm, based on the output of model 3?

I really thank you,

Antony

Phil_Kay · Jun 23, 2021 07:29 AM

Hi,

First of all, a few clarifications about my example:

Models 1 and 2 have the same p-value (and other statistics) for the likelihood ratio tests. There are some differences in the reports in JMP but these are essentially the same model. Logistic regression and the GLM with binomial distribution and logit link are equivalent. Logistic regression is a special case of GLM.
You can call model 3 a logistic regression with Firth adjustment.
Model 3 does not have the same p-values and statistics as the other models. By using the Firth adjustment, the estimates are adjusted (no surprise!).
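
To illustrate the equivalence described above outside JMP, here is a minimal pure-Python sketch, assuming a single predictor plus an intercept. It fits the same toy data twice: once by maximizing the Bernoulli log-likelihood directly with Newton-Raphson, and once as a GLM through iteratively reweighted least squares. The data and all function names are hypothetical and chosen only for this illustration; this is not how JMP implements either platform.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def solve_2x2(m, v):
    """Solve m @ x = v for a 2x2 matrix m using Cramer's rule."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [(v[0] * m[1][1] - m[0][1] * v[1]) / det,
            (m[0][0] * v[1] - m[1][0] * v[0]) / det]

def weighted_moments(w, x):
    """Return X'WX for a design matrix with columns (1, x)."""
    s0 = sum(w)
    s1 = sum(wi * xi for wi, xi in zip(w, x))
    s2 = sum(wi * xi * xi for wi, xi in zip(w, x))
    return [[s0, s1], [s1, s2]]

def fit_logistic_newton(x, y, iters=25):
    """Logistic regression: Newton-Raphson on the Bernoulli log-likelihood."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        p = [sigmoid(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        score = [sum(yi - pi for yi, pi in zip(y, p)),
                 sum((yi - pi) * xi for xi, yi, pi in zip(x, y, p))]
        # Fisher information (negative Hessian).
        info = weighted_moments([pi * (1 - pi) for pi in p], x)
        d0, d1 = solve_2x2(info, score)
        b0, b1 = b0 + d0, b1 + d1
    return b0, b1

def fit_glm_binomial_logit(x, y, iters=25):
    """GLM (binomial, logit link): iteratively reweighted least squares."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        eta = [b0 + b1 * xi for xi in x]
        mu = [sigmoid(e) for e in eta]
        w = [m * (1 - m) for m in mu]  # working weights
        # Working response for the logit link.
        z = [e + (yi - m) / wi for e, yi, m, wi in zip(eta, y, mu, w)]
        rhs = [sum(wi * zi for wi, zi in zip(w, z)),
               sum(wi * zi * xi for wi, zi, xi in zip(w, z, x))]
        b0, b1 = solve_2x2(weighted_moments(w, x), rhs)
    return b0, b1

# Hypothetical toy data without separation (outcomes overlap along x).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 1, 0, 1, 0, 1, 1]

newton = fit_logistic_newton(x, y)
irls = fit_glm_binomial_logit(x, y)
print("logistic (Newton):", newton)
print("GLM (IRLS):       ", irls)
```

Both routines should print the same intercept and slope up to floating-point error, because for the logit link the IRLS update is algebraically the same step as Newton-Raphson on the log-likelihood.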

I am not an expert on Firth Adjustment. In fact, I have no real understanding of how it works.

So you can report the results of your version of model 3. You can say that it is a logistic regression with Firth adjustment. You can report the p-values to support hypothesis testing in the usual way. But you should probably also say that without the Firth adjustment the model was unstable.

Again, it is not possible for me to say why the model is unstable. But I can say that it should be a concern.

I would never recommend just reporting p-values from models as being conclusive evidence. JMP is a visual exploration environment. Before fitting statistical models I always use plots to check the quality of the data and make sense of the behaviours in the data. In this case, a visual exploration of the data might help you to understand why some models are unstable.
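
As a complement to plotting, a simple tabulation can also hint at why a model is unstable. The sketch below, in plain Python, counts outcomes within each level of a categorical predictor and flags levels where only one outcome value was observed, which is a typical sign of quasi-complete separation. The variable names and data are hypothetical, and this is only a rough check for a single categorical predictor, not a full diagnostic.

```python
from collections import Counter

def crosstab(predictor, outcome):
    """Count observations for every (predictor level, outcome) pair."""
    return Counter(zip(predictor, outcome))

def levels_with_single_outcome(predictor, outcome):
    """Return predictor levels where not every outcome value was observed."""
    counts = crosstab(predictor, outcome)
    outcomes = set(outcome)
    flagged = []
    for level in sorted(set(predictor)):
        seen = {o for o in outcomes if counts[(level, o)] > 0}
        if len(seen) < len(outcomes):
            flagged.append(level)
    return flagged

# Hypothetical data: the "high" exposure level never occurs among controls (0).
exposure = ["low", "low", "mid", "mid", "mid", "high", "high", "high"]
cancer = [0, 1, 0, 1, 1, 1, 1, 1]

flagged = levels_with_single_outcome(exposure, cancer)
print(flagged)
```

A flagged level means the data contain no information to bound the corresponding coefficient, so an ordinary maximum likelihood fit can drift toward very large estimates.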

Antony · Jun 23, 2021 02:30 PM

Thank you very much!

Phil_Kay · Jun 23, 2021 07:43 AM

I just searched for "quasi-complete separation of data" and I found this. It explains what the term means, how it leads to unstable estimates, and how the Firth adjustment helps.

I have attached the example that they use as a data table with the same 3 models (Logistic, GLM, GLM with Firth) saved as scripts.
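
For readers without the attached table, here is a minimal pure-Python sketch of the same idea. It uses a hypothetical six-row data set with quasi-complete separation and evaluates two criteria along one direction of the coefficient space (slope s, intercept -3s): the ordinary log-likelihood, and the log-likelihood plus the Firth penalty, taken here as half the log determinant of the Fisher information. The data, the grid, and the function names are assumptions made for illustration, and the grid search stands in for a proper optimizer.

```python
import math

# Hypothetical data with quasi-complete separation: x < 3 is always 0,
# x > 3 is always 1, and only the tie at x == 3 shows both outcomes.
x = [1, 2, 3, 3, 4, 5]
y = [0, 0, 0, 1, 1, 1]

def log_sigmoid(z):
    """Numerically stable log(1 / (1 + exp(-z)))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))

def log_likelihood(b0, b1):
    """Ordinary Bernoulli log-likelihood for intercept b0 and slope b1."""
    total = 0.0
    for xi, yi in zip(x, y):
        eta = b0 + b1 * xi
        total += log_sigmoid(eta) if yi == 1 else log_sigmoid(-eta)
    return total

def firth_penalty(b0, b1):
    """0.5 * log det(X'WX), with W = diag(p * (1 - p))."""
    s0 = s1 = s2 = 0.0
    for xi in x:
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        w = p * (1.0 - p)
        s0 += w
        s1 += w * xi
        s2 += w * xi * xi
    return 0.5 * math.log(s0 * s2 - s1 * s1)

# Walk along the direction in which the ordinary fit diverges.
slopes = [0.25 * k for k in range(1, 41)]
ml = [log_likelihood(-3.0 * s, s) for s in slopes]
penalized = [m + firth_penalty(-3.0 * s, s) for m, s in zip(ml, slopes)]

best_ml_slope = slopes[ml.index(max(ml))]
best_firth_slope = slopes[penalized.index(max(penalized))]
print("slope maximizing ordinary log-likelihood on the grid:", best_ml_slope)
print("slope maximizing penalized log-likelihood on the grid:", best_firth_slope)
```

The ordinary log-likelihood keeps rising all the way to the edge of the grid, which is the "unstable estimate" behaviour: however far the grid extends, a larger slope always fits slightly better. The penalized criterion peaks at a moderate slope inside the grid, which is why the Firth-adjusted fit reports finite, interpretable estimates.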

I hope this helps.

Phil

Nominal logistic vs GLM with binomial distribution and logit function
