<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Table Subset Random Rows in Discussions</title>
    <link>https://community.jmp.com/t5/Discussions/Table-Subset-Random-Rows/m-p/659442#M84870</link>
    <description>&lt;P&gt;Hello there,&lt;/P&gt;&lt;P&gt;I have a table with 1 million rows (the sample size) and 3000+ columns (the parameters to study).&lt;/P&gt;&lt;P&gt;I tried Subset with random sampling at a sampling rate of 0.01 to reduce the data to 10k rows while still representing the original table adequately. Most of the aggregated statistics and CPK values still closely match those of the full table, but some tail observations may be excluded.&lt;/P&gt;&lt;P&gt;I couldn't find more details about this feature in the JMP help or manual. Could you share how the random sampling is done in the background? Is it a Random Uniform kind of selection, evenly distributed from row 1 to row N, or something else?&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;</description>
    <pubDate>Mon, 17 Jul 2023 23:30:03 GMT</pubDate>
    <dc:creator>ylee</dc:creator>
    <dc:date>2023-07-17T23:30:03Z</dc:date>
    <item>
      <title>Table Subset Random Rows</title>
      <link>https://community.jmp.com/t5/Discussions/Table-Subset-Random-Rows/m-p/659442#M84870</link>
      <description>&lt;P&gt;Hello there,&lt;/P&gt;&lt;P&gt;I have a table with 1 million rows (the sample size) and 3000+ columns (the parameters to study).&lt;/P&gt;&lt;P&gt;I tried Subset with random sampling at a sampling rate of 0.01 to reduce the data to 10k rows while still representing the original table adequately. Most of the aggregated statistics and CPK values still closely match those of the full table, but some tail observations may be excluded.&lt;/P&gt;&lt;P&gt;I couldn't find more details about this feature in the JMP help or manual. Could you share how the random sampling is done in the background? Is it a Random Uniform kind of selection, evenly distributed from row 1 to row N, or something else?&lt;/P&gt;&lt;P&gt;Thank you.&lt;/P&gt;</description>
      <pubDate>Mon, 17 Jul 2023 23:30:03 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Table-Subset-Random-Rows/m-p/659442#M84870</guid>
      <dc:creator>ylee</dc:creator>
      <dc:date>2023-07-17T23:30:03Z</dc:date>
    </item>
    <item>
      <title>Re: Table Subset Random Rows</title>
      <link>https://community.jmp.com/t5/Discussions/Table-Subset-Random-Rows/m-p/659461#M84871</link>
      <description>JMP is giving you a simple random sample: each observation in the original dataset is equally likely to be in the subset.&lt;BR /&gt;&lt;BR /&gt;It sounds like you want a sample that is stratified by CPK. If you have JMP Pro, you could do this with the “Make Validation Column” utility.</description>
      <pubDate>Tue, 18 Jul 2023 00:39:15 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Table-Subset-Random-Rows/m-p/659461#M84871</guid>
      <dc:creator>Jordan_Hiller</dc:creator>
      <dc:date>2023-07-18T00:39:15Z</dc:date>
    </item>
  </channel>
</rss>

