Table Subset Random Rows

ylee — Mon, 17 Jul 2023 23:30:03 GMT

Hello there,

I have a table with 1 million rows (the sample size) and more than 3,000 columns (the parameters to study).

I tried Subset with Random - sampling rate = 0.01 to reduce the data to 10k rows while still representing the original table adequately. Most of the aggregated statistics and the CPK values still match the full table closely, but some tail observations may be excluded.

I couldn't find more details about this feature in the JMP help/manual. Could you share how the random sampling is done in the background? Is it a uniform random selection, evenly distributed from row 1 to row N, or something else?

Thank you.

Re: Table Subset Random Rows

Jordan_Hiller — Tue, 18 Jul 2023 00:39:15 GMT

JMP is giving you a Simple Random Sample: each observation in the original dataset is equally likely to be in the subset.
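
To make "equally likely" concrete, here is a minimal Python sketch of a simple random sample without replacement. It is only an illustration of the idea, not JMP's actual implementation; the function name `simple_random_sample` and the fixed seed are assumptions made for the example.

```python
import random

def simple_random_sample(n_rows, rate, seed=None):
    """Return sorted row indices of a simple random sample (no replacement).

    Hypothetical helper for illustration; not how JMP does it internally.
    """
    rng = random.Random(seed)
    k = round(n_rows * rate)  # target sample size
    # random.sample gives every row the same chance of being picked
    return sorted(rng.sample(range(n_rows), k))

rows = simple_random_sample(1_000_000, 0.01, seed=1)
print(len(rows))  # number of sampled rows
```

Because every row has the same chance of selection regardless of its value, any specific extreme row is left out about 99% of the time at a 0.01 rate, which would be consistent with the tail observations going missing.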

It sounds like you want a sample that is stratified by CPK. If you have JMP Pro, you could do this with the “Make Validation Column” utility.
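
As a rough sketch of what stratification means in general (this is not the “Make Validation Column” utility itself), the Python below samples the same fraction from each stratum and keeps at least one row per stratum. The function name `stratified_sample`, the "bulk"/"tail" labels, and the at-least-one rule are all assumptions made for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(strata, rate, seed=None):
    """strata: one stratum label per row. Returns sorted sampled row indices.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for i, label in enumerate(strata):
        groups[label].append(i)  # collect row indices per stratum
    picked = []
    for idx in groups.values():
        # Assumption: keep at least one row so small strata are never dropped
        k = max(1, round(len(idx) * rate))
        picked.extend(rng.sample(idx, k))
    return sorted(picked)

# Made-up data: 990 "bulk" rows and 10 "tail" rows
labels = ["bulk"] * 990 + ["tail"] * 10
sample = stratified_sample(labels, 0.01, seed=1)
print(len(sample), sum(labels[i] == "tail" for i in sample))
```

With a plain simple random sample at the same rate, the small "tail" group would often contribute no rows at all; sampling within each stratum guarantees it is represented.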
