<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Feature selection in JMP in Discussions</title>
    <link>https://community.jmp.com/t5/Discussions/Feature-selection-in-JMP/m-p/34911#M20620</link>
    <description>&lt;P&gt;Hi&lt;/P&gt;&lt;P&gt;I know that JMP Pro has the Generalized Regression module for feature selection (I'm not interested in feature extraction...). Do you know if&amp;nbsp;a similar module is in JMP (standard version) too? If not,&amp;nbsp;which is the best method to use with JMP 12/13 to reduce the number of predictors to&amp;nbsp;leave only the most important ones?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks,&amp;nbsp;Felice&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Tue, 24 Jan 2017 16:35:00 GMT</pubDate>
    <dc:creator>FR60</dc:creator>
    <dc:date>2017-01-24T16:35:00Z</dc:date>
    <item>
      <title>Feature selection in JMP</title>
      <link>https://community.jmp.com/t5/Discussions/Feature-selection-in-JMP/m-p/34911#M20620</link>
      <description>&lt;P&gt;Hi&lt;/P&gt;&lt;P&gt;I know that JMP Pro has the Generalized Regression module for feature selection (I'm not interested in feature extraction...). Do you know if&amp;nbsp;a similar module is in JMP (standard version) too? If not,&amp;nbsp;which is the best method to use with JMP 12/13 to reduce the number of predictors to&amp;nbsp;leave only the most important ones?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks,&amp;nbsp;Felice&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 24 Jan 2017 16:35:00 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Feature-selection-in-JMP/m-p/34911#M20620</guid>
      <dc:creator>FR60</dc:creator>
      <dc:date>2017-01-24T16:35:00Z</dc:date>
    </item>
    <item>
      <title>Re: Feature selection in JMP</title>
      <link>https://community.jmp.com/t5/Discussions/Feature-selection-in-JMP/m-p/34920#M20623</link>
      <description>&lt;P&gt;Hello Felice,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;There is no Generalized Regression per se in regular JMP, but you do have a few options. If you have highly correlated factors/inputs, you can use either Partial Least Squares or Principal Component Analysis. Another option would be Partition Analysis (decision trees). Be sure to use a holdout set to check the predictive capability of any model you build. For partition analysis, look under the red triangle hot spot at the top and select Column Contributions to see which factors are most important.&lt;/P&gt;
&lt;P&gt;Hope this helps.&lt;/P&gt;
&lt;P&gt;Bill&lt;/P&gt;</description>
      <pubDate>Tue, 24 Jan 2017 17:42:04 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Feature-selection-in-JMP/m-p/34920#M20623</guid>
      <dc:creator>Bill_Worley</dc:creator>
      <dc:date>2017-01-24T17:42:04Z</dc:date>
    </item>
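The Partition idea above can be sketched outside JMP. The toy below (plain Python, illustrative only; the data and predictor names are made up) ranks two predictors by how much a single regression-tree-style split on each would reduce the response's sum of squared errors. The predictor winning the first split is itself a form of variable selection, and Column Contributions aggregates this kind of information over many splits.

```python
# Illustrative sketch, not JMP: rank predictors by the best single split,
# the way a regression tree chooses its root split.

def sse(ys):
    """Sum of squared deviations from the mean."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split_gain(x, y):
    """Largest SSE reduction achievable by splitting on predictor x."""
    total = sse(y)
    best = 0.0
    for cut in sorted(set(x)):
        left = [yi for xi, yi in zip(x, y) if xi <= cut]
        right = [yi for xi, yi in zip(x, y) if xi > cut]
        if left and right:
            best = max(best, total - sse(left) - sse(right))
    return best

# Made-up data: y depends strongly on x1, not on x2.
x1 = [1, 2, 3, 4, 5, 6, 7, 8]
x2 = [5, 1, 4, 2, 8, 3, 7, 6]
y  = [1.0, 1.1, 0.9, 1.2, 9.0, 9.2, 8.8, 9.1]

gains = {"x1": best_split_gain(x1, y), "x2": best_split_gain(x2, y)}
print(max(gains, key=gains.get))  # the predictor chosen for the first split
```

Here the split on x1 at 4 separates the low and high responses almost perfectly, so x1 dominates the gain ranking.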
    <item>
      <title>Re: Feature selection in JMP</title>
      <link>https://community.jmp.com/t5/Discussions/Feature-selection-in-JMP/m-p/34921#M20624</link>
      <description>Hi Felice,&lt;BR /&gt;&lt;BR /&gt;There is also something called Predictor Screening in JMP. Here is a link about that option.&lt;BR /&gt;&lt;BR /&gt;&lt;A href="http://www.jmp.com/support/help/13/Predictor_Screening.shtml" target="_blank"&gt;http://www.jmp.com/support/help/13/Predictor_Screening.shtml&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;It can help reduce the number of candidate predictors, and it is available in standard JMP.&lt;BR /&gt;</description>
      <pubDate>Tue, 24 Jan 2017 17:48:07 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Feature-selection-in-JMP/m-p/34921#M20624</guid>
      <dc:creator>Chris_Kirchberg</dc:creator>
      <dc:date>2017-01-24T17:48:07Z</dc:date>
    </item>
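JMP's Predictor Screening platform ranks a large set of predictors by importance (internally it uses a bootstrap forest). As a rough stand-in for the idea of "rank, then keep the top few", here is a minimal sketch in plain Python using a simple univariate correlation screen instead; this is a deliberate simplification, not JMP's actual method, and all names and data are made up.

```python
# Sketch of predictor screening: rank predictors by |correlation| with the
# response and keep the top k. (JMP's real platform uses a bootstrap forest.)
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def screen(predictors, y, k):
    """Return the names of the k predictors most correlated with y."""
    ranked = sorted(predictors, key=lambda name: -abs(pearson(predictors[name], y)))
    return ranked[:k]

y = [2.0, 4.1, 6.0, 7.9, 10.1]
predictors = {
    "signal": [1, 2, 3, 4, 5],   # tracks y closely
    "noise1": [3, 1, 4, 1, 5],
    "noise2": [9, 2, 6, 5, 3],
}
print(screen(predictors, y, 1))
```

A univariate screen like this misses interactions, which is one reason tree-based screening (as in JMP) is preferred for large, messy predictor sets.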
    <item>
      <title>Re: Feature selection in JMP</title>
      <link>https://community.jmp.com/t5/Discussions/Feature-selection-in-JMP/m-p/34926#M20626</link>
      <description>&lt;P&gt;You've gotten some good choices from the previous responses (some of which I will repeat), but there are others.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You could just use stepwise regression (a classic). You could use all-subsets regression, variable clustering, Partition, PLS, or PCR (principal component regression). There are graph-based methods: normal plots, Pareto plots, Bayes plots. You could use Predictor Screening, and so on.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;There is a new course being created right now called "JMP Software: Finding Important Predictors" that addresses this issue and covers these techniques, using both JMP and JMP Pro. This two-day class will likely be available for delivery at customer locations starting in April. It will be offered as a public class for the first time at the Discovery conference this October in St. Louis.&lt;/P&gt;</description>
      <pubDate>Tue, 24 Jan 2017 20:16:05 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Feature-selection-in-JMP/m-p/34926#M20626</guid>
      <dc:creator>Dan_Obermiller</dc:creator>
      <dc:date>2017-01-24T20:16:05Z</dc:date>
    </item>
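Of the methods listed above, forward stepwise regression is the easiest to sketch. The plain-Python example below (illustrative only; the data are made up, and the fixed `min_gain` threshold is a stand-in for the p-value or AIC/BIC stopping rules a real stepwise implementation such as JMP's would use) greedily adds whichever candidate most reduces the residual sum of squares.

```python
# Forward stepwise selection, sketched with ordinary least squares solved
# via the normal equations (Gaussian elimination, no libraries).

def ols_sse(cols, y):
    """Residual SSE of least squares fit: y ~ intercept + cols."""
    n = len(y)
    X = [[1.0] + [c[i] for c in cols] for i in range(n)]
    p = len(X[0])
    # Normal equations A b = rhs
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(p)] for i in range(p)]
    rhs = [sum(X[r][i] * y[r] for r in range(n)) for i in range(p)]
    # Gaussian elimination with partial pivoting
    for i in range(p):
        piv = max(range(i, p), key=lambda r: abs(A[r][i]))
        A[i], A[piv] = A[piv], A[i]
        rhs[i], rhs[piv] = rhs[piv], rhs[i]
        for r in range(i + 1, p):
            f = A[r][i] / A[i][i]
            for j in range(i, p):
                A[r][j] -= f * A[i][j]
            rhs[r] -= f * rhs[i]
    b = [0.0] * p
    for i in range(p - 1, -1, -1):
        b[i] = (rhs[i] - sum(A[i][j] * b[j] for j in range(i + 1, p))) / A[i][i]
    fit = [sum(bj * xj for bj, xj in zip(b, row)) for row in X]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fit))

def forward_stepwise(candidates, y, min_gain=1.0):
    """Greedily add the predictor with the biggest SSE drop; stop when it stalls."""
    chosen, remaining = [], dict(candidates)
    current = ols_sse([], y)  # intercept-only model
    while remaining:
        def sse_with(name):
            return ols_sse([candidates[c] for c in chosen] + [remaining[name]], y)
        best = min(remaining, key=sse_with)
        new = sse_with(best)
        if current - new < min_gain:
            break
        chosen.append(best)
        del remaining[best]
        current = new
    return chosen

# Made-up data: y is roughly 2*x1 + 3*x2; x3 is noise.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
x3 = [5, 3, 6, 1, 2, 4]
y = [8.1, 6.9, 18.2, 16.8, 28.1, 27.0]
selected = forward_stepwise({"x1": x1, "x2": x2, "x3": x3}, y)
print(selected)
```

On this data the procedure picks the two informative predictors and stops before adding the noise column.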
    <item>
      <title>Re: Feature selection in JMP</title>
      <link>https://community.jmp.com/t5/Discussions/Feature-selection-in-JMP/m-p/34937#M20632</link>
      <description>&lt;P&gt;In addition to all the great suggestions from my colleagues above, if you have a relatively small number of variables I'd also look at the Multivariate Methods -&amp;gt; Multivariate -&amp;gt; Scatterplot Matrix platform. This gives you a matrix of pairwise scatterplots and correlations, which is valuable for discovering correlations within the predictor set. Such correlations don't play nicely with some regression procedures, such as ordinary least squares (called Standard Least Squares in JMP). If you go down the standard least squares path, be sure to check the variance inflation factors in the Parameter Estimates table for indications of predictor correlation.&lt;/P&gt;</description>
      <pubDate>Wed, 25 Jan 2017 12:03:26 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Feature-selection-in-JMP/m-p/34937#M20632</guid>
      <dc:creator>Peter_Bartell</dc:creator>
      <dc:date>2017-01-25T12:03:26Z</dc:date>
    </item>
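The variance inflation factor mentioned above is VIF_j = 1 / (1 - R²_j), where R²_j comes from regressing predictor j on all the other predictors. For exactly two predictors that R² is just the squared pairwise correlation, which makes the idea easy to sketch in plain Python (illustrative data, made-up values; with more predictors you would fit the full auxiliary regression instead).

```python
# VIF sketch for a pair of predictors: VIF = 1 / (1 - r^2).
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sqrt(sum((x - ma) ** 2 for x in a))
    sb = sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

def vif_pair(a, b):
    """VIF for one of two predictors: 1 / (1 - r^2)."""
    r = pearson(a, b)
    return 1.0 / (1.0 - r * r)

x1 = [1, 2, 3, 4, 5]
x2 = [2.1, 3.9, 6.2, 8.0, 9.8]   # nearly a multiple of x1: collinear
x3 = [5, 1, 4, 2, 3]             # unrelated to x1

print(vif_pair(x1, x2))  # large: collinearity inflates coefficient variance
print(vif_pair(x1, x3))  # near 1: little shared information
```

Rules of thumb vary, but VIF values much above 5-10 are the usual signal that a predictor is entangled with the others.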
    <item>
      <title>Re: Feature selection in JMP</title>
      <link>https://community.jmp.com/t5/Discussions/Feature-selection-in-JMP/m-p/34948#M20641</link>
      <description>&lt;P&gt;Great&lt;/P&gt;&lt;P&gt;I used it. Very nice.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Felice&lt;/P&gt;</description>
      <pubDate>Wed, 25 Jan 2017 15:46:00 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Feature-selection-in-JMP/m-p/34948#M20641</guid>
      <dc:creator>FR60</dc:creator>
      <dc:date>2017-01-25T15:46:00Z</dc:date>
    </item>
    <item>
      <title>Re: Feature selection in JMP</title>
      <link>https://community.jmp.com/t5/Discussions/Feature-selection-in-JMP/m-p/34951#M20642</link>
      <description>&lt;P&gt;&lt;FONT color="#ff0000"&gt;Hi Dan,&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT color="#ff0000"&gt;thank you very much for your message.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT color="#ff0000"&gt;I have a few comments.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You could just use stepwise regression (a classic).&lt;/P&gt;&lt;P&gt;&lt;FONT color="#ff0000"&gt;Can it handle more than 1K predictors?&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You could use variable clustering.&lt;/P&gt;&lt;P&gt;&lt;FONT color="#ff0000"&gt;Can you give me more details on this technique and on how to choose&amp;nbsp;important predictors through clustering?&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You could use Partition.&lt;/P&gt;&lt;P&gt;&lt;FONT color="#ff0000"&gt;Typically in our fab we&amp;nbsp;use it&amp;nbsp;after removing unimportant predictors (consider that&amp;nbsp;we generally have more than a thousand predictors and a lot of noise in our data...).&amp;nbsp;&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You could use PCR (principal component regression).&lt;/P&gt;&lt;P&gt;&lt;FONT color="#ff0000"&gt;We know this, but we lose information about predictor meaning, so we prefer not to use it.&amp;nbsp;&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;There are graph-based methods: normal plots, Pareto plots, Bayes plots.&lt;/P&gt;&lt;P&gt;&lt;FONT color="#ff0000"&gt;I don't know how to do feature selection with graphs. Sorry.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;There is a new course being created right now called "JMP Software: Finding Important Predictors" that addresses this issue and covers these techniques, using both JMP and JMP Pro. This two-day class will likely be available for delivery at customer locations starting in April.&lt;/P&gt;&lt;P&gt;&lt;FONT color="#ff0000"&gt;This is great news. I hope it will be available in Italy too. If so, I will certainly attend.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;FONT color="#ff0000"&gt;Rgds.&amp;nbsp; Felice&lt;/FONT&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 25 Jan 2017 16:06:36 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Feature-selection-in-JMP/m-p/34951#M20642</guid>
      <dc:creator>FR60</dc:creator>
      <dc:date>2017-01-25T16:06:36Z</dc:date>
    </item>
    <item>
      <title>Re: Feature selection in JMP</title>
      <link>https://community.jmp.com/t5/Discussions/Feature-selection-in-JMP/m-p/34952#M20643</link>
      <description>
&lt;P&gt;&lt;FONT color="#0000FF"&gt;My answers to your questions are in blue.&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/7746"&gt;@FR60&lt;/a&gt; wrote:&lt;BR /&gt;
&lt;P&gt;&lt;FONT color="#ff0000"&gt;Ciao Dan&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#ff0000"&gt;thank you very much for your msg.&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#ff0000"&gt;I have some comment to do....&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You could just use stepwise regression (a classic).&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#ff0000"&gt;Can it manipulate more than 1K predictors?&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000FF"&gt;Yes. If you have the memory on your machine to handle large problems. I just ran a simple example with 10,000 observations and 2,000 predictors. Stepwise worked fine.&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You could use variable clustering.&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#ff0000"&gt;Can you give me more details on this tecnique on how to choose&amp;nbsp;important predictors through clustering? &lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000FF"&gt;This would be an approach that is similar to principal components analysis, but instead of you looking at loading plots to see similar variables, JMP will cluster them for you automatically. You can then choose the variable that is most representative of the cluster or even create the "typical" variable for the cluster. This will help you avoid the "redundant information" you often see with many variables.&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You could use Partition.&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#ff0000"&gt;Tipically in our Fab we&amp;nbsp;use it&amp;nbsp;after removing not important predictors (let's consider that&amp;nbsp;generally we have more than thousands predictors and a lot of noisy in our data ....)&amp;nbsp;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000FF"&gt;There are many ways to use Partition, but think of that very first split. The approach needs to determine which split contains the most information. That is a variable selection. You could also use a trick that Dick DeVeaux calls "shaking the tree". Split many, many, times then look at the column contributions of the variables to identify the most important ones. So many ways to use this flexible platform!&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You could use PCR (principal component regression).&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#ff0000"&gt;We know this but we loose information on predictor meaning and then we prefer don't use it.&amp;nbsp;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000FF"&gt;Understood. I am not a big fan of PCR for this reason. However, by looking at the loadings of the variables for only the significant principal components, you could possibly identify the original variables that are important. Use those important original variables to start building your model. There is nothing that says you must stick with the principal components.&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;There are graph-based methods: normal plots, pareto plot, bayes plot.&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#ff0000"&gt;I don't know how to do feature selection with graph. Sorry.&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000FF"&gt;Nothing to be sorry about. That is why we are creating the class. People are often unaware of these tools.&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;There is a new course being created right now called "JMP Software: Finding Important Predictors" that addresses this issue and covers these techniques, using both JMP and JMP Pro. This two day class will likely be available for delivery at customer locations starting in April.&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#ff0000"&gt;This is a great news. I hope that will be available for Italy too. If yes for sure I will follow them.&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000FF"&gt;All of the classes that we create at SAS in the U.S. are available to the international SAS offices, too. If they do not have an instructor that knows the topic, they can request one from another region that does have the skill set. Just ask your local SAS office for the training!&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#ff0000"&gt;Rgds.&amp;nbsp; Felice&lt;/FONT&gt;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000FF"&gt;I hope my comments have been helpful.&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;&lt;FONT color="#0000FF"&gt;Dan&lt;/FONT&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 25 Jan 2017 16:49:13 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Feature-selection-in-JMP/m-p/34952#M20643</guid>
      <dc:creator>Dan_Obermiller</dc:creator>
      <dc:date>2017-01-25T16:49:13Z</dc:date>
    </item>
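The variable-clustering workflow Dan describes (group correlated predictors, then keep one representative per cluster) can be imitated outside JMP. The sketch below is a deliberately crude stand-in in plain Python: a greedy single-pass grouping by absolute correlation against each cluster's seed, with the representative chosen as the member most correlated on average with the rest of its cluster. The data, names, and 0.9 threshold are all made up; JMP's Cluster Variables platform uses a more principled algorithm and chooses the number of clusters for you.

```python
# Crude variable-clustering sketch: group by |correlation|, pick one
# representative per cluster.
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = sqrt(sum((x - ma) ** 2 for x in a))
    sb = sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

def cluster_variables(data, threshold=0.9):
    """Greedily add each variable to the first cluster whose seed it matches."""
    clusters = []
    for name in data:
        for cl in clusters:
            if abs(pearson(data[name], data[cl[0]])) >= threshold:
                cl.append(name)
                break
        else:
            clusters.append([name])
    return clusters

def representative(cluster, data):
    """Member with the highest average |correlation| to the rest of its cluster."""
    if len(cluster) == 1:
        return cluster[0]
    def avg_abs_corr(name):
        others = [m for m in cluster if m != name]
        return sum(abs(pearson(data[name], data[m])) for m in others) / len(others)
    return max(cluster, key=avg_abs_corr)

data = {
    "a": [1, 2, 3, 4, 5],
    "b": [2.2, 3.9, 6.1, 8.0, 10.1],   # tracks a
    "c": [1.1, 2.1, 2.9, 4.2, 5.0],    # also tracks a
    "d": [5, 1, 4, 2, 3],              # unrelated
}
clusters = cluster_variables(data)
reps = [representative(cl, data) for cl in clusters]
print(clusters)
print(reps)
```

The redundant trio collapses to a single representative while the unrelated variable survives on its own, which is exactly the dimensionality reduction Felice is after.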
    <item>
      <title>Re: Feature selection in JMP</title>
      <link>https://community.jmp.com/t5/Discussions/Feature-selection-in-JMP/m-p/34953#M20644</link>
      <description>&lt;P&gt;&lt;FONT color="#ff0000"&gt;Hi Dan,&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT color="#ff0000"&gt;your comments were more than helpful. Just one last question about clustering.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;FONT color="#0000ff"&gt;This would be an approach that is similar to principal components analysis, but instead of you looking at loading plots to see similar variables, JMP will cluster them for you automatically. You can then choose the variable that is most representative of the cluster or even create the "typical" variable for the cluster. This will help you avoid the "redundant information" you often see with many variables.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;FONT color="#ff0000"&gt;Once I have n&amp;nbsp;clusters (is there a rule for choosing this number?), how can I identify&amp;nbsp;the most representative variable for each cluster?&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;FONT color="#ff0000"&gt;Thanks,&amp;nbsp;Felice&amp;nbsp;&lt;/FONT&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 25 Jan 2017 17:09:35 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Feature-selection-in-JMP/m-p/34953#M20644</guid>
      <dc:creator>FR60</dc:creator>
      <dc:date>2017-01-25T17:09:35Z</dc:date>
    </item>
    <item>
      <title>Re: Feature selection in JMP</title>
      <link>https://community.jmp.com/t5/Discussions/Feature-selection-in-JMP/m-p/34954#M20645</link>
      <description>&lt;P&gt;JMP will tell you which variable is most representative of the cluster. It will also determine the proper number of clusters to use.&lt;/P&gt;</description>
      <pubDate>Wed, 25 Jan 2017 17:31:19 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Feature-selection-in-JMP/m-p/34954#M20645</guid>
      <dc:creator>Dan_Obermiller</dc:creator>
      <dc:date>2017-01-25T17:31:19Z</dc:date>
    </item>
  </channel>
</rss>

