tnad
Level III

variable selection with a correlation cutoff

I'm screening thousands of variables and would like to keep only those that are not very highly correlated with each other, for example only those with -0.95 < r < 0.95. Is there an easy way to do this? I can use Multivariate Methods > Multivariate to calculate the correlations, but I have no idea how to make the selection according to the cutoff above.

3 REPLIES

Re: variable selection with a correlation cutoff

You can use the pairwise correlations report instead of the default matrix version of the report. Click the red triangle at the top and choose Pairwise Correlations. Next, right-click the new report and select Sort by Column. Select the column with the p-values, and choose the sort order that is most useful to you (ascending or descending). If you like, right-click the report again and select Make into Data Table.

tnad
Level III

Re: variable selection with a correlation cutoff

Thanks. I can filter on p-value here, but I'm not sure how to use this table to keep only the variables that are not highly correlated with each other.

txnelson
Super User

Re: variable selection with a correlation cutoff

@Mark_Bailey's suggestion will work; however, I find that what you are attempting can be done easily using

     Analyze==>Screening==>Response Screening

It creates a data table where you can sort, select, etc. the results.

Jim
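
For readers who export the data and want to see the selection logic itself, here is a minimal sketch in Python rather than JMP. It is not a JMP feature or anyone's posted solution, just one common greedy approach: walk through the columns in order and keep a column only if its absolute Pearson correlation with every column already kept is below the cutoff. The function names (`pearson_r`, `drop_highly_correlated`), the sample data, and the choice to treat constant columns as uncorrelated are all assumptions made for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # Constant column: correlation is undefined; treated as 0 here (assumption).
        return 0.0
    return sxy / sqrt(sxx * syy)


def drop_highly_correlated(columns, cutoff=0.95):
    """Greedy filter: keep a column only if |r| < cutoff against every kept column."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson_r(values, columns[k])) < cutoff for k in kept):
            kept.append(name)
    return kept


# Hypothetical sample data: x2 is nearly 2 * x1, and x4 is exactly -x1.
data = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "x2": [2.1, 3.9, 6.2, 8.0, 9.9],
    "x3": [5.0, 1.0, 4.0, 2.0, 3.0],
    "x4": [-1.0, -2.0, -3.0, -4.0, -5.0],
}

kept = drop_highly_correlated(data)
print(kept)  # ['x1', 'x3']
```

Note that the result depends on column order: whichever member of a correlated group comes first is the one that survives. With thousands of variables the pairwise loop grows quadratically, so a matrix-based correlation routine would likely be preferable in practice.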