ylee
Level III

Table Subset Random Rows

Hello there,

 

I have a table with 1 million rows (a sample size of 1 million) and 3000+ columns (3000+ parameters to study).

 

I tried Subset with Random - sampling rate = 0.01 to reduce my data to 10k rows while still representing the original table adequately. Most of the aggregated statistics and CPK values still match the full table closely, but some tail observations may be excluded.

 

I couldn't find more details about this feature in the JMP help/manual. Could you share how the random sampling is done in the background? Is it a uniform random selection, evenly distributed from row 1 to row N, or something else?

 

Thank you.

 

1 ACCEPTED SOLUTION

Re: Table Subset Random Rows

JMP is giving you a Simple Random Sample: each observation in the original dataset is equally likely to be in the subset.
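To make "simple random sample" concrete, here is a minimal Python sketch of sampling without replacement. It is not JMP's internal code; the function name `simple_random_sample` and the choice to round `n_rows * rate` to a fixed sample size are assumptions made for illustration.

```python
import random

def simple_random_sample(n_rows, rate, seed=None):
    """Pick round(n_rows * rate) distinct row indices, each row equally likely."""
    rng = random.Random(seed)
    k = round(n_rows * rate)  # assumed: a fixed sample size derived from the rate
    # random.sample draws without replacement, so no row is picked twice.
    return sorted(rng.sample(range(n_rows), k))

# 1 million rows at a 0.01 sampling rate gives 10,000 rows.
rows = simple_random_sample(1_000_000, 0.01, seed=1)
print(len(rows))
```

Under this scheme every row has the same 1% chance of being kept, so any single extreme observation is left out 99% of the time. That is consistent with aggregate statistics matching well while some tail points go missing.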

Sounds like you want a sample that is stratified by CPK. If you have JMP Pro you could do this with the “Make Validation Column” utility.
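As a rough illustration of the stratified idea (not what "Make Validation Column" does internally), the sketch below samples within each stratum separately and can force a minimum number of rows per stratum. The function name, the `min_per_stratum` parameter, and the "core"/"tail" labels are all hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(strata, rate, min_per_stratum=1, seed=None):
    """Sample each stratum at `rate`, keeping at least `min_per_stratum` rows from it."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row, label in enumerate(strata):
        groups[label].append(row)
    chosen = []
    for label_rows in groups.values():
        k = max(min_per_stratum, round(len(label_rows) * rate))
        k = min(k, len(label_rows))  # never ask for more rows than the stratum has
        chosen.extend(rng.sample(label_rows, k))
    return sorted(chosen)

# Hypothetical labels: 990 "core" rows followed by 10 "tail" rows.
labels = ["core"] * 990 + ["tail"] * 10
picked = stratified_sample(labels, 0.1, min_per_stratum=5, seed=1)
tail_kept = sum(1 for r in picked if labels[r] == "tail")
print(len(picked), tail_kept)
```

Because the small stratum is oversampled relative to its share, statistics computed on such a subset would need weighting to stay comparable with the full table.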
