We are having a little trouble wrapping our heads around how JMP treats categorical data in random forests. We created a small pilot data set and mapped the categorical data using a variety of techniques, including many suggested in this forum. However, I don't really understand why we see so large a difference in performance across these mappings. If I am mapping a discrete set of values to another discrete set of values (e.g., character strings to integers), why should it make so much of a difference in JMP?
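For concreteness, here is a minimal sketch of the kind of mapping we mean (the column values and names are made up, and this is plain Python rather than our actual pipeline): the same categorical column encoded once as integers and once as one-hot indicator columns before being fed to a random forest.

```python
# Hypothetical categorical column; values are illustrative only.
colors = ["red", "green", "blue", "green", "red"]

# Mapping 1: character strings -> integers (label encoding).
levels = sorted(set(colors))                 # ['blue', 'green', 'red']
to_int = {c: i for i, c in enumerate(levels)}
as_ints = [to_int[c] for c in colors]        # [2, 1, 0, 1, 2]

# Mapping 2: one-hot / indicator columns, one per level.
as_onehot = [[1 if c == lvl else 0 for lvl in levels] for c in colors]

print(as_ints)       # [2, 1, 0, 1, 2]
print(as_onehot[0])  # row for "red" -> [0, 0, 1]
```

Both encodings carry the same information, which is why we expected a tree-based method to be largely insensitive to the choice; an implementation that treats the integer column as ordinal could of course split it differently than one that treats the original strings as nominal.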
We don't see this kind of variation when using Python's or MATLAB's random forest algorithms. With JMP, the differences in error rates, both on held-out data and on the training set, are substantial.
We have read most of the posts on this topic and can supply more specifics, including a trial data set, if necessary. But before we jump down the rabbit hole of choosing a mapping that optimizes performance in JMP, I was hoping someone could briefly explain why JMP's implementation of random forests is so sensitive to how categorical data is mapped.
Thanks.