dharding
Level II

Random row selection below a certain threshold

Hello All,

I am trying to randomly select 20% of the observations in my table whose values are below 200, and then exclude those observations from any analysis. In other words, I want a random row selection at a given percentage, restricted to rows below a certain threshold. Thanks for any pointers!

1 ACCEPTED SOLUTION

XanGregg
Staff

Re: Random row selection below a certain threshold

Two options I can think of.

1. Use Select Where to select rows with values < 200. Make a subset from the selected rows and choose Link to Original Table. Make a subset of that subset using a random sample, again choosing Link to Original Table. Now select all the rows in the final subset; because of the linking, they will also be selected in the original table.

2. Make a new column with a formula such as :value < 200 & Random Uniform() < 0.2. Rows where the new column is 1 are the ones to select and exclude.
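
For readers who want to see the logic of option 2 outside JMP, here is a minimal Python sketch of the same idea. The function name flag_rows and its parameters are hypothetical and not part of JMP. Each row below the threshold gets an independent uniform draw, so the flagged share is only approximately 20%, not exactly 20%.

```python
import random

def flag_rows(values, threshold=200, fraction=0.2, seed=None):
    """Return one boolean per row: True if the row is below the threshold
    AND its uniform random draw falls under `fraction`.

    Hypothetical helper mirroring the column formula
    `:value < 200 & Random Uniform() < 0.2`.
    """
    rng = random.Random(seed)  # seed is optional, for reproducible runs
    return [v < threshold and rng.random() < fraction for v in values]

# Example: 20,000 rows, half of them below 200.
values = list(range(400)) * 50
flags = flag_rows(values, seed=1)

# Rows kept for analysis are the ones that were not flagged.
kept = [v for v, f in zip(values, flags) if not f]
```

Because every draw is independent, the number of flagged rows varies from run to run around 20% of the rows below the threshold. If exactly 20% is required, a sort-and-take approach is a better fit.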


3 REPLIES
fugue
Level I

Re: Random row selection below a certain threshold

A simple approach would be to use a data step to get all the obs that satisfy your cutoff value (<200), apply one of the SAS RAND functions to generate a random number for each row, sort by the random number, and then keep only the top (or bottom) 20%. Then merge back with your original data to exclude those obs.
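
The steps above can be sketched in Python rather than SAS; this is an illustration only, and the function name pick_rows_to_exclude and its parameters are hypothetical. It filters to rows below the cutoff, orders them by a random key, and takes the first 20%, so the sampled share is exact up to rounding.

```python
import random

def pick_rows_to_exclude(values, threshold=200, fraction=0.2, seed=None):
    """Return the set of row indices to exclude: a random `fraction`
    of the rows whose value is below `threshold`.

    Hypothetical helper; it mirrors the filter / random number / sort /
    keep-top-20% steps described in the post.
    """
    rng = random.Random(seed)
    # Step 1: keep only the rows that satisfy the cutoff.
    candidates = [i for i, v in enumerate(values) if v < threshold]
    # Steps 2 and 3: give each candidate a random key and sort by it.
    shuffled = sorted(candidates, key=lambda _i: rng.random())
    # Step 4: keep the first `fraction` of the sorted rows (rounded).
    k = round(len(shuffled) * fraction)
    return set(shuffled[:k])

# Example: 400 rows, 200 of them below the cutoff.
values = list(range(400))
excluded = pick_rows_to_exclude(values, seed=1)

# "Merge back": drop the sampled rows from the original data.
kept = [v for i, v in enumerate(values) if i not in excluded]
```

Unlike a per-row random test, this always returns the same number of rows for a given table, which matters when the 20% figure needs to be exact.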

dharding
Level II

Re: Random row selection below a certain threshold

Thanks Xan,

that works great!