
Partial least squares parameter variance


Feb 5, 2019 12:46 PM
(2612 views)

Hopefully this question makes sense. I'm doing logistic regression with N=2000 observations and p=160 covariates. There is obvious correlation structure in the design matrix, and I can see that simply by looking at the VIFs when I do simple logistic regression. I then did PLS and got a different set of coefficients, though there are still 160 of them since PLS does not do variable reduction. Is there a place in the PLS report to find the variances of the new set of coefficients, to see that they are reduced? Does it make sense that I'd want to look at these, since I might be trading bias for variance?
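For readers outside JMP, the VIF diagnostic mentioned above is easy to reproduce by hand. A minimal sketch in Python; the simulated data are hypothetical stand-ins, not the poster's N=2000, p=160 table:

```python
import numpy as np

def vif(X):
    """VIF for column j: ss_tot / ss_res from regressing it on the other columns."""
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        A = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(A, X[:, j], rcond=None)
        resid = X[:, j] - A @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((X[:, j] - X[:, j].mean()) ** 2).sum())
        out[j] = ss_tot / ss_res          # = 1 / (1 - R^2)
    return out

rng = np.random.default_rng(0)
z = rng.normal(size=(500, 1))
X = np.hstack([z + 0.1 * rng.normal(size=(500, 3)),   # three near-duplicate columns
               rng.normal(size=(500, 2))])            # two independent columns
v = vif(X)
# the three collinear columns show large VIFs; the independent ones sit near 1
```

A common rule of thumb is that VIFs above roughly 10 flag serious multicollinearity, which is the situation the poster describes.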

1 ACCEPTED SOLUTION


4 REPLIES


Re: Partial least squares parameter variance

Gene:

Your questions make sense. However, can I offer an alternative thought? If variable reduction is key to your practical problem, have you considered using either the Lasso or Elastic Net subpersonalities of the Fit Model -> Generalized Regression personality/platform? These techniques are suited to multicollinearity among the predictors, though they have some important characteristics to consider as well; these are explained in the JMP documentation. Both can be used with a categorical response, and the 'final' parameter estimate confidence intervals are reported right along with the table of parameter estimates.
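Outside of JMP, the same Lasso/Elastic Net idea can be sketched with scikit-learn's penalized logistic regression. This is only an illustration of the technique on made-up data; note that, unlike JMP's Generalized Regression report, scikit-learn does not report confidence intervals alongside the estimates:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n, p = 400, 30
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.05 * rng.normal(size=n)         # deliberate collinearity
eta = X[:, 0] - X[:, 2]
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-eta))).astype(int)

# Lasso-style fit: the L1 penalty zeroes out many coefficients (variable reduction)
lasso = LogisticRegression(penalty="l1", solver="saga", C=0.1, max_iter=5000).fit(X, y)
# Elastic-net fit: mixes L1 and L2, spreading credit across correlated predictors
enet = LogisticRegression(penalty="elasticnet", solver="saga",
                          l1_ratio=0.5, C=0.1, max_iter=5000).fit(X, y)
n_dropped = int((lasso.coef_ == 0).sum())             # predictors the lasso removed
```

The penalty strength `C` and mixing ratio `l1_ratio` here are arbitrary illustration values; in practice they would be chosen by validation, which is roughly what JMP's solution-path plot is doing.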


Re: Partial least squares parameter variance

Thanks Peter,

In my case it is really all about the correlation structure of the predictors. I'm not convinced that variable reduction is warranted in my application. In fact, that is a question I haven't answered yet, and I'm searching for a definitive way to test whether overfitting is occurring even when I use all my predictors. I have N=1839 observations with p=148 covariates. I've included the table. If we just use regular regression (treat the nominal output as continuous) we can see that there are plenty of very large VIFs. So we know we will also have a problem when we do logistic regression. We have correlation problems for sure, but maybe not an overfitting problem. That's why I went with PLS. If you cluster the variables you get only 35 clusters, suggesting that there are only 35 or so informative features within the 148 covariates. So why focus on eliminating any of them instead of just using a method robust to correlation?
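The variable-clustering observation above (148 covariates collapsing to about 35 clusters) can be illustrated with hierarchical clustering on a 1 − |correlation| distance. A hypothetical sketch on simulated data, not the poster's table:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

rng = np.random.default_rng(2)
latent = rng.normal(size=(300, 3))                    # three underlying signals
# twelve observed columns: four noisy copies of each latent signal
X = np.repeat(latent, 4, axis=1) + 0.1 * rng.normal(size=(300, 12))

corr = np.corrcoef(X, rowvar=False)
dist = 1.0 - np.abs(corr)                             # correlated columns -> small distance
np.fill_diagonal(dist, 0.0)
Z = linkage(squareform(dist, checks=False), method="average")
labels = fcluster(Z, t=0.5, criterion="distance")
n_clusters = len(set(labels))
# the twelve columns collapse to the three latent signals
```

The cluster count is the point: it estimates how many genuinely distinct sources of information sit inside the correlated predictor set, which matches the poster's argument for a correlation-robust method rather than variable elimination.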

When I run logistic regression I get an AUC of only 0.815. So even using all the covariates doesn't give me a perfect fit. Is AUC < 1 a legit way to assert that overfitting is not the issue?
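One more direct check than "AUC < 1" is to compare the in-sample AUC against a cross-validated AUC: a large gap is the signature of overfitting, while a training AUC below 1 by itself does not rule it out. A sketch on simulated data (deliberately overparameterized, p=50 with only one informative column, so the gap is visible):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n, p = 200, 50
X = rng.normal(size=(n, p))
y = (rng.uniform(size=n) < 1 / (1 + np.exp(-X[:, 0]))).astype(int)

model = LogisticRegression(max_iter=5000).fit(X, y)
train_auc = roc_auc_score(y, model.decision_function(X))        # optimistic
cv_auc = cross_val_score(model, X, y, cv=5, scoring="roc_auc").mean()
gap = train_auc - cv_auc                                        # in-sample optimism
```

If the poster's 0.815 held up under cross-validation (or a holdout set), that would be much stronger evidence against overfitting than the raw value alone.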

I welcome any thoughts you have.


Re: Partial least squares parameter variance

Thanks Peter,

One thing that troubles me is the large difference in the models when adaptive lasso is selected. You can see in the adaptive model that the Scaled -LogLikelihood value doesn't have a well-defined minimum. Another issue is that you have modeled Y=0 and you get 59 terms in the model, but if you target Y=1 you get a completely different model with 71 terms. I don't get this. Without the adaptive option it doesn't matter whether I target 0 or 1; the models are the same except the terms have different signs. Since I find the adaptive results difficult to understand, let alone explain to my customers, I stick with vanilla lasso.

With vanilla lasso we can get reasonable performance in the confusion matrices with a threshold of 0.1522. I say reasonable in the sense that the required sensitivity and specificity are arguably met.
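The mechanics of scoring at a custom cutoff like 0.1522 can be sketched as follows; the toy labels and scores below are made up, only the thresholding logic matters:

```python
import numpy as np

def confusion(y_true, prob, threshold):
    """2x2 counts plus sensitivity/specificity at a given probability cutoff."""
    pred = (prob >= threshold).astype(int)
    tp = int(((pred == 1) & (y_true == 1)).sum())
    tn = int(((pred == 0) & (y_true == 0)).sum())
    fp = int(((pred == 1) & (y_true == 0)).sum())
    fn = int(((pred == 0) & (y_true == 1)).sum())
    return tp, tn, fp, fn, tp / (tp + fn), tn / (tn + fp)

y = np.array([1, 1, 1, 0, 0, 0, 0, 0])
prob = np.array([0.9, 0.4, 0.2, 0.3, 0.1, 0.05, 0.5, 0.12])
tp, tn, fp, fn, sens, spec = confusion(y, prob, 0.1522)
# sens = 1.0 (all three events caught), spec = 0.6 (two false alarms)
```

Pushing the cutoff well below 0.5, as the poster does, trades specificity for sensitivity, which is the usual move when the event class is rare or misses are costly.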

That said, we still have the question of whether we need to do regularization at all. Even vanilla logistic regression results in the two AUCs being somewhat close. But we wouldn't want to use LR because we know there is serious correlation. So, without significant evidence of overfitting, I'd move to PLS in order to address only the correlation issue. But since JMP models the target as continuous we don't get all the ROC, AUC, and confusion matrices in the report. I can get them by applying the inverse logit to the linear predictor, and I think that is legit...and that circles back to my original question on how to treat PLS results for model comparison when the other models are based strictly on logit transformation of the linear predictors.
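One reassuring detail about transforming the PLS linear predictor this way: because the inverse logit is strictly monotone, the resulting AUC is identical whether you rank observations by the raw linear predictor or by the transformed pseudo-probabilities, so the AUC comparison across models is on equal footing either way. A quick check on simulated scores:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(4)
lin_pred = rng.normal(size=500)                       # stand-in for a PLS linear predictor
y = (rng.uniform(size=500) < 1 / (1 + np.exp(-lin_pred))).astype(int)

prob = 1 / (1 + np.exp(-lin_pred))                    # inverse logit (logistic) transform
auc_raw = roc_auc_score(y, lin_pred)
auc_prob = roc_auc_score(y, prob)
# identical: AUC depends only on how the scores rank the observations
```

The transform does matter for the confusion matrices, though, since a probability cutoff like 0.1522 only makes sense on the transformed scale.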

By the way, I greatly appreciate your diving into these questions since my own thorough understanding of what the heck I'm doing is very important to me.

I've included the same data table with a few more models.