Table Subset Random Rows
Hello there,
I have a table with 1 million rows (the sample size) and 3,000+ columns (the parameters to study).
I tried Subset with Random - sampling rate = 0.01 to reduce the data to about 10k rows while still representing the original table well. Most of the aggregated statistics and CPK values still match the full table closely, but some tail observations may be excluded.
I couldn't find more details about this feature in the JMP help/manual. Could you share how the random sampling is done in the background? Is it a uniform random selection, evenly distributed from row 1 to row N, or something else?
Thank you.
1 ACCEPTED SOLUTION
Re: Table Subset Random Rows
JMP is giving you a Simple Random Sample: each observation in the original dataset is equally likely to be in the subset.
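To make "equally likely" concrete, here is a minimal Python sketch of a simple random sample without replacement. It is not JMP's internal code; the function name `simple_random_sample`, the seed, and the figure of 100 tail rows are assumptions made for illustration. It also shows why tail observations can drop out: each row is kept with probability equal to the sampling rate, so a small group of extreme rows can easily be missed entirely.

```python
import random

def simple_random_sample(n_rows, rate, seed=None):
    """Return sorted row indices of a simple random sample without replacement."""
    rng = random.Random(seed)
    k = round(n_rows * rate)          # number of rows to keep
    return sorted(rng.sample(range(n_rows), k))

n_rows = 1_000_000
rows = simple_random_sample(n_rows, 0.01, seed=1)

# Every row has the same chance of being kept: the sampling rate.
p_keep = len(rows) / n_rows

# Rough chance that none of m specific tail rows is kept
# (approximation that treats the rows as independent draws).
m = 100  # hypothetical number of tail rows in the full table
p_miss_all = (1 - p_keep) ** m
```

With these assumed numbers, `p_miss_all` is far from negligible, which is consistent with the observation that aggregate statistics hold up while some tail rows disappear.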
Sounds like you want a sample that is stratified by CPK. If you have JMP Pro you could do this with the “Make Validation Column” utility.
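As a rough illustration of the stratified idea, the Python sketch below bins one numeric column into equal-width strata and samples within each stratum, keeping at least one row per non-empty stratum so the tails stay represented. This is not how "Make Validation Column" is implemented; the function `stratified_sample`, the equal-width binning, the minimum of one row per stratum, and the synthetic Gaussian data are all assumptions.

```python
import random
from collections import defaultdict

def stratified_sample(values, n_bins, rate, seed=None):
    """Sample row indices within equal-width bins of `values` (assumed scheme)."""
    rng = random.Random(seed)
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0   # fall back to 1.0 if all values are equal
    strata = defaultdict(list)
    for i, v in enumerate(values):
        b = min(int((v - lo) / width), n_bins - 1)  # clamp the max value into the last bin
        strata[b].append(i)
    picked = []
    for idx in strata.values():
        k = max(1, round(len(idx) * rate))  # keep at least one row per non-empty stratum
        picked.extend(rng.sample(idx, k))
    return sorted(picked)

# Synthetic stand-in for one parameter column.
data_rng = random.Random(0)
values = [data_rng.gauss(0, 1) for _ in range(100_000)]
rows = stratified_sample(values, n_bins=20, rate=0.01, seed=1)
```

Because sparse tail strata are sampled at a higher effective rate than the bulk, statistics computed from such a subset are no longer a plain simple random sample and may need weighting.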