Can Logistic Regression estimate original model pa...

Nov 13, 2013 9:02 AM
(944 views)

I am new to JMP and to logistic regression, and I would like to test how well logistic regression estimates model parameters. I simulated categorical response data from model parameters (betas) that I chose myself. The model includes all types of independent variables: nominal, ordinal, and continuous. I generated each independent variable with a random number generator, calculated the probability of the response for each randomly generated set of independent-variable values, and then generated the response variable (1 or 0) according to that probability. I repeated this until I had a large number of data points (each data 'point' being one instance of all the independent variables and the corresponding response variable), which I fed into the logistic regression analysis. My questions are:

1. Can I expect logistic regression to estimate the original betas?

2. Should the estimates asymptotically approach the true values of the betas (that I used to generate the data) as I increase the number of data points?

3. What happens with nominal variables, which logistic regression estimates by the dummy-variable method? I have an independent variable that takes 3 nominal values, for which I used 3 corresponding betas to generate the simulated data. The regression output reports two betas, and the third value is the negative of the sum of these two (if I understand the method correctly).

Any help in developing this understanding will be appreciated.

Thanks in advance.
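
The simulation described in this post can be sketched roughly as follows. This is a minimal illustration in Python with NumPy rather than JMP, and everything specific in it is an assumption: the coefficient values, the function names (`simulate`, `effect_design`, `fit_logistic`), the use of one continuous and one three-level nominal predictor (the ordinal predictor is omitted), and a hand-written Newton-Raphson fit in place of JMP's own estimation. The nominal effects are chosen to sum to zero, matching the sum-to-zero coding the question describes.

```python
import numpy as np

def simulate(n, intercept, beta_x, level_effects, rng):
    """Draw n rows: a continuous x, a nominal level, and a 0/1 response."""
    effects = np.asarray(level_effects, dtype=float)
    x = rng.normal(size=n)
    level = rng.integers(0, len(effects), size=n)
    eta = intercept + beta_x * x + effects[level]   # linear predictor
    p = 1.0 / (1.0 + np.exp(-eta))                  # P(y = 1)
    y = (rng.random(n) < p).astype(float)
    return x, level, y

def effect_design(x, level, n_levels):
    """Columns: intercept, x, then n_levels-1 sum-to-zero coded columns."""
    cols = [np.ones_like(x), x]
    last = (level == n_levels - 1).astype(float)
    for k in range(n_levels - 1):
        cols.append((level == k).astype(float) - last)
    return np.column_stack(cols)

def fit_logistic(X, y, n_iter=25):
    """Maximum-likelihood fit by plain Newton-Raphson iterations."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = p * (1.0 - p)
        grad = X.T @ (y - p)
        hess = X.T @ (X * w[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

rng = np.random.default_rng(0)
# Illustrative "true" values: intercept, slope for x, a1, a2.
# The third level effect is implied: a3 = -(a1 + a2) = -0.5.
true = np.array([-0.5, 1.0, 0.8, -0.3])
errors = {}
for n in (500, 5000, 50000):
    x, level, y = simulate(n, -0.5, 1.0, [0.8, -0.3, -0.5], rng)
    est = fit_logistic(effect_design(x, level, 3), y)
    errors[n] = np.abs(est - true).max()
    print(n, np.round(est, 3), "max abs error:", round(errors[n], 3))
```

With a setup like this, the largest estimation error would be expected to shrink as `n` grows, although any single run can fluctuate.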

1 ACCEPTED SOLUTION


Here I am replying to myself, but I have run a number of simulations, and here is what I found. JMP assumes that a categorical variable with N levels has N-1 free coefficients (degrees of freedom), so to simulate data consistent with that I must assign the N level coefficients a1, a2, a3, ..., aN so that the last one satisfies aN = -(a1 + a2 + ... + a(N-1)). In this case, running logistic regression on the simulated data correctly estimates the original values. If I do not set the last coefficient according to that constraint and instead assign it an arbitrary value, the estimated coefficients differ from the original values. However, the estimated betas still yield the same probabilities as those computed from the original betas.
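
One way to see why the probabilities agree even when the coefficients do not: subtracting the mean of the level coefficients from each of them and adding that mean to the intercept leaves every linear predictor unchanged. The sketch below checks this arithmetic for one nominal variable. The numbers and the helper names (`to_sum_to_zero`, `probs`) are hypothetical, and the code only illustrates the algebra, not JMP's internals.

```python
import numpy as np

def to_sum_to_zero(intercept, level_effects):
    """Re-express arbitrary level effects as an intercept plus centered effects."""
    a = np.asarray(level_effects, dtype=float)
    shift = a.mean()
    return intercept + shift, a - shift

def probs(intercept, level_effects):
    """Response probability at each level of the nominal variable."""
    eta = intercept + np.asarray(level_effects, dtype=float)
    return 1.0 / (1.0 + np.exp(-eta))

# a3 is chosen arbitrarily here, not as -(a1 + a2).
b0, a = -0.5, [0.8, -0.3, 1.0]
b0_c, a_c = to_sum_to_zero(b0, a)

print("centered intercept:", b0_c)
print("centered effects:  ", a_c, "sum =", a_c.sum())
print("same probabilities:", np.allclose(probs(b0, a), probs(b0_c, a_c)))
```

Under this reading, a sum-to-zero fit would be expected to recover the centered values rather than the arbitrary ones used in the simulation, which is consistent with the behaviour reported above.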

7 REPLIES

Nov 13, 2013 9:59 AM
(619 views)

I'd like to say no, because you can't account for the covariance between the variables, but my stats knowledge is very fuzzy today.

Generally, though, if your model has multiple parameters at a time or any interaction terms, you may not be able to recreate your betas.

If you're doing it one variable at a time (univariate), perhaps.

Nov 13, 2013 10:11 AM
(619 views)

Thanks. I did not use any interaction between any of the independent variables. Each IV is independent of the others.

Your answer gives me one thing to try. I will simulate the data with one variable and see if at least that allows me to estimate my "true" beta from the simulated data.

Any suggestions on my third question, about interpreting the dummy variables used in place of the nominal variables?

Nov 13, 2013 10:29 AM
(619 views)

Nov 14, 2013 6:15 AM
(619 views)

1) Yes, generically.

2) Yes.

3) It depends on the parameterization, but the reference level is usually combined with the intercept estimate.

For more on this topic, see Chapters 11-12 in Simulating Data with SAS, particularly Section 12.2.2.

For a discussion of the effect parameterizations, see the SAS/STAT(R) 12.3 User's Guide.
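
As a rough illustration of point 3, the sketch below converts sum-to-zero (effect-coded) estimates into a reference-level parameterization, where the reference level's effect is folded into the intercept and the remaining coefficients become differences from that level. Treating the last level as the reference is an assumption made for the example, and the function name `effect_to_reference` and the numbers are hypothetical.

```python
import numpy as np

def effect_to_reference(intercept, effects):
    """Convert effect-coded estimates to reference coding (last level = reference).

    effects holds the n_levels-1 reported coefficients; the last level's
    effect is implied as the negative of their sum.
    """
    full = np.append(effects, -np.sum(effects))
    ref_intercept = intercept + full[-1]      # reference level joins the intercept
    contrasts = full[:-1] - full[-1]          # other levels relative to reference
    return ref_intercept, contrasts

b0, effects = 0.0, np.array([0.3, -0.8])
ref_b0, contrasts = effect_to_reference(b0, effects)

# Linear predictor at each of the three levels under both parameterizations.
eta_effect = b0 + np.append(effects, -effects.sum())
eta_ref = ref_b0 + np.append(contrasts, 0.0)

print("reference-coded intercept:", ref_b0)
print("reference-coded contrasts:", contrasts)
print("same linear predictors:   ", np.allclose(eta_effect, eta_ref))
```

Both parameterizations describe the same fitted model, so which set of coefficients a simulation should be compared against depends on the coding the software uses.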

Nov 15, 2013 5:51 AM
(619 views)

Nov 18, 2013 6:20 AM
(619 views)
