Hello there,
I have a table with 1 million rows (the sample size) and 3,000+ columns (the parameters to study).
I tried Subset with the Random - sampling rate option set to 0.01, to reduce the data to about 10k rows while still representing the original table well. Most of the aggregated statistics and Cpk values still match the full table closely, but some tail observations may be excluded.
I couldn't find more details about this feature in the JMP help/manual. Could you share how the random sampling is done in the background? Is it a uniform random selection, where every row from row 1 to row N has the same chance of being picked? Or something else?
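To make the question concrete, here is a minimal Python sketch of what I am assuming "uniform" means: exactly round(N * rate) rows drawn without replacement, each row equally likely. This is only my guess at the behavior, not JMP's actual implementation, and the function name `subset_random` and its `seed` parameter are made up for illustration.

```python
import random

def subset_random(n_rows, sampling_rate, seed=None):
    """Hypothetical sketch: pick round(n_rows * sampling_rate) row indices
    uniformly at random, without replacement. Not JMP's actual code."""
    rng = random.Random(seed)
    k = round(n_rows * sampling_rate)
    # Every row has the same inclusion probability k / n_rows.
    return sorted(rng.sample(range(n_rows), k))

rows = subset_random(1_000_000, 0.01, seed=0)
print(len(rows))  # 10000
```

Under this assumption, any single row (for example, the one holding a column's most extreme value) has only a 1% chance of being kept, which would explain why some tail observations drop out even though the aggregate statistics stay close.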
Thank you.