Discriminant Analysis vs. Predictor Screening: Different hits from same data set...

Sep 12, 2018 9:00 AM

Hi JMP Community,

First, let me apologize for not sharing actual data and results: I'm currently working on sensitive data sets that I have not had the time to anonymize.

I would like to better understand the differences between the Discriminant Analysis platform and the Predictor Screening platform. I understand that these are different approaches with different assumptions. Still, if a combination of continuous variables scores high in the Predictor Screening platform, would it be reasonable to expect at least some of the same variables to be picked by the Discriminant Analysis platform (Stepwise Variable Selection)?

In other words, if the top hits from Discriminant Analysis and Predictor Screening are mostly different, does that strongly suggest that none of the variables entered in these models are actually associated with the outcome?

Thank you for your help.

Sincerely,

TS

Thierry R. Sornasse

1 ACCEPTED SOLUTION


These techniques are very different with different assumptions.

Predictor Screening is based on a random forest (a bootstrap forest in JMP terminology), which is an ensemble of tree models. It makes no distributional assumptions.

Discriminant analysis is a multivariate technique that is fairly sensitive to the normality assumption. In addition, some discriminant models (quadratic, regularized, etc.) can require large sample sizes to estimate.
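To illustrate why the two approaches can rank variables differently even when real associations exist, here is a minimal sketch in plain Python (not JSL, and not what either JMP platform runs internally). It assumes that stepwise discriminant selection behaves roughly like ranking by a between-group F ratio and that a tree ranks variables by impurity reduction from threshold splits. The data and the names `x_shift`, `x_ring`, `f_ratio`, and `best_split_gain` are made up for the example.

```python
from statistics import mean

def f_ratio(x, y):
    """One-way ANOVA F statistic of x across the classes in y."""
    groups = {}
    for value, label in zip(x, y):
        groups.setdefault(label, []).append(value)
    grand = mean(x)
    k, n = len(groups), len(x)
    ss_between = 0.0
    ss_within = 0.0
    for g in groups.values():
        m = mean(g)
        ss_between += len(g) * (m - grand) ** 2
        ss_within += sum((value - m) ** 2 for value in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def gini(labels):
    """Gini impurity for 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split_gain(x, y):
    """Largest Gini impurity reduction from a single threshold split on x."""
    pairs = sorted(zip(x, y))
    labels = [label for _, label in pairs]
    n = len(labels)
    parent = gini(labels)
    best = 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate tied values
        left, right = labels[:i], labels[i:]
        gain = parent - len(left) / n * gini(left) - len(right) / n * gini(right)
        best = max(best, gain)
    return best

n = 200
# x_ring: class 1 sits in both tails, class 0 in the middle, so the
# class means coincide even though x_ring clearly carries information.
x_ring = [-2 + 4 * i / (n - 1) for i in range(n)]
y = [1 if abs(v) > 1 else 0 for v in x_ring]
# x_shift: class 1 is shifted upward relative to class 0 (a mean shift).
x_shift = [(i % 10) + 8 * label for i, label in enumerate(y)]

f_shift, f_ring = f_ratio(x_shift, y), f_ratio(x_ring, y)
gain_shift, gain_ring = best_split_gain(x_shift, y), best_split_gain(x_ring, y)
print(f"x_shift: F = {f_shift:.1f}, split gain = {gain_shift:.3f}")
print(f"x_ring:  F = {f_ring:.2e}, split gain = {gain_ring:.3f}")
```

In this toy case the F ratio for `x_ring` is essentially zero, so a mean-based criterion would pass over it, while a single tree split already finds a clear impurity reduction. Disagreement between the two rankings is therefore not, by itself, evidence that nothing is associated with the outcome.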

Which is best and "correct" for you? Who knows? Check assumptions closely. If the data are large, could you perform the analysis on multiple subsets (which is what predictor screening does automatically)?
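The subset idea can be sketched as follows: repeat the screening on many random subsets and count how often each variable lands among the top hits. This is a minimal plain-Python illustration on simulated data, not JMP's procedure. The score used here (a standardized difference in class means) and the names `effect_size`, `selection_counts`, and `x1` to `x4` are assumptions made for the example.

```python
import random
from collections import Counter
from statistics import mean, variance

def effect_size(x, y):
    """Absolute standardized difference in means between classes 0 and 1."""
    a = [v for v, label in zip(x, y) if label == 0]
    b = [v for v, label in zip(x, y) if label == 1]
    pooled_sd = ((variance(a) + variance(b)) / 2) ** 0.5
    return abs(mean(a) - mean(b)) / pooled_sd

def selection_counts(columns, y, n_subsets=50, fraction=0.6, top_k=2, seed=0):
    """Count how often each column ranks in the top_k across random subsets."""
    rng = random.Random(seed)
    n = len(y)
    counts = Counter()
    for _ in range(n_subsets):
        rows = rng.sample(range(n), int(fraction * n))  # without replacement
        y_sub = [y[i] for i in rows]
        scores = {
            name: effect_size([col[i] for i in rows], y_sub)
            for name, col in columns.items()
        }
        for name in sorted(scores, key=scores.get, reverse=True)[:top_k]:
            counts[name] += 1
    return counts

# Simulated data: x1 is truly related to the class, x2..x4 are pure noise.
rng = random.Random(1)
y = [i % 2 for i in range(120)]
columns = {
    "x1": [3.0 * label + rng.gauss(0, 1) for label in y],
    "x2": [rng.gauss(0, 1) for _ in y],
    "x3": [rng.gauss(0, 1) for _ in y],
    "x4": [rng.gauss(0, 1) for _ in y],
}

n_subsets, top_k = 50, 2
counts = selection_counts(columns, y, n_subsets=n_subsets, top_k=top_k)
for name in columns:
    print(name, counts[name])
```

A variable that is picked in nearly every subset is a more credible hit than one that appears only occasionally, whichever platform produced the ranking.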

To quote George Box, all models are wrong, but some are useful. Look for something that is useful.

Dan Obermiller
