<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: using categorical data with random forests in JMP in Discussions</title>
    <link>https://community.jmp.com/t5/Discussions/using-categorical-data-with-random-forests-in-JMP/m-p/578843#M78617</link>
    <description>&lt;P&gt;Are the mapped integer values using the nominal modeling type, or the default continuous modeling type?&lt;/P&gt;</description>
    <pubDate>Thu, 08 Dec 2022 13:24:53 GMT</pubDate>
    <dc:creator>Mark_Bailey</dc:creator>
    <dc:date>2022-12-08T13:24:53Z</dc:date>
    <item>
      <title>using categorical data with random forests in JMP</title>
      <link>https://community.jmp.com/t5/Discussions/using-categorical-data-with-random-forests-in-JMP/m-p/578702#M78608</link>
      <description>&lt;P&gt;We are having a little trouble wrapping our heads around how JMP treats categorical data in random forests. We created a small pilot data set and mapped the categorical data using a variety of techniques, including many suggested in this forum. However, I don't understand why performance should differ so much across these mappings. If I am mapping one discrete set of values to another (e.g., character strings to integers), why should it make so much of a difference in JMP?&lt;/P&gt;&lt;P&gt;We don't see this kind of variation when using Python's or MATLAB's random forest algorithms. With JMP, the differences in error rates, both on held-out data and on the training set, are significant.&lt;/P&gt;&lt;P&gt;We have read most of the posts on this topic and can supply more specifics, including a trial data set, if necessary. But before we go down the rabbit hole of choosing a method that optimizes performance in JMP, I was hoping someone could briefly explain why JMP's implementation of random forests is so sensitive to how you map categorical data.&lt;/P&gt;&lt;P&gt;Thanks.&lt;/P&gt;</description>
      <pubDate>Fri, 09 Jun 2023 00:58:19 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/using-categorical-data-with-random-forests-in-JMP/m-p/578702#M78608</guid>
      <dc:creator>TheTerminalMan</dc:creator>
      <dc:date>2023-06-09T00:58:19Z</dc:date>
    </item>
    <item>
      <title>Re: using categorical data with random forests in JMP</title>
      <link>https://community.jmp.com/t5/Discussions/using-categorical-data-with-random-forests-in-JMP/m-p/578843#M78617</link>
      <description>&lt;P&gt;Are the mapped integer values using the nominal modeling type, or the default continuous modeling type?&lt;/P&gt;</description>
      <pubDate>Thu, 08 Dec 2022 13:24:53 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/using-categorical-data-with-random-forests-in-JMP/m-p/578843#M78617</guid>
      <dc:creator>Mark_Bailey</dc:creator>
      <dc:date>2022-12-08T13:24:53Z</dc:date>
    </item>
    <item>
      <title>Re: using categorical data with random forests in JMP</title>
      <link>https://community.jmp.com/t5/Discussions/using-categorical-data-with-random-forests-in-JMP/m-p/579509#M78678</link>
      <description>&lt;P&gt;Hi Mark,&lt;/P&gt;&lt;P&gt;Good point. My grad student doing this work said "oh!" :)&lt;/P&gt;&lt;P&gt;Thanks very much,&lt;/P&gt;&lt;P&gt;-Joe&lt;/P&gt;</description>
      <pubDate>Fri, 09 Dec 2022 18:15:54 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/using-categorical-data-with-random-forests-in-JMP/m-p/579509#M78678</guid>
      <dc:creator>TheTerminalMan</dc:creator>
      <dc:date>2022-12-09T18:15:54Z</dc:date>
    </item>
  </channel>
</rss>

