Decision trees: same model inputs, different results
I just repeated a decision tree analysis that I originally ran yesterday. The input data and all the modeling options were identical in both runs, yet the results differed. I tried it again and got yet another set of results. Why does this happen? It will be disconcerting to report results that can't be independently replicated.
Accepted Solutions
Re: Decision trees: same model inputs, different results
20% of your data is being held back for validation. The validation rows are sampled randomly, so you get a different set each time. Because the tree is fit only on the remaining 80%, the split points and fit statistics change along with the sample.
If you have JMP Pro, you can create a validation column and control the hold-back sample that way. If you don't have Pro, you can try this:
1. Add a new column and initialize its data as a random indicator. By default, 20% of the rows will have the value 1 and 80% the value 0.
2. Exclude all the rows with the value 1.
3. Run the Partition platform with the validation portion set to 0.
It should automatically use the excluded rows for validation.
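To illustrate why a fixed indicator column makes the results repeatable, here is a minimal sketch in Python rather than JMP. Everything in it is an assumption made for illustration: the names `holdback_indicator` and `best_cut`, the synthetic data, and the seed value are hypothetical, and the single-split misclassification search is only a stand-in for a tree fit, not the criterion the Partition platform actually uses.

```python
import random

def holdback_indicator(n_rows, portion=0.2, rng=None):
    """Return a 0/1 list; 1 marks a row held back for validation."""
    rng = rng or random.Random()  # no seed given: a new sample on every call
    return [1 if rng.random() < portion else 0 for _ in range(n_rows)]

def best_cut(x, y, indicator):
    """Cut on x that misclassifies the fewest training rows (indicator == 0)."""
    train = sorted((xi, yi) for xi, yi, v in zip(x, y, indicator) if v == 0)
    best = None
    for i in range(1, len(train)):
        if train[i - 1][0] == train[i][0]:
            continue  # no cut fits between two equal x values
        cut = (train[i - 1][0] + train[i][0]) / 2
        errors = sum((xi >= cut) != yi for xi, yi in train)
        if best is None or errors < best[0]:
            best = (errors, cut)
    return best[1]

# Synthetic data (hypothetical): y is a noisy version of "x is at least 50".
data_rng = random.Random(1)
x = [data_rng.uniform(0, 100) for _ in range(200)]
y = [(xi + data_rng.gauss(0, 15)) >= 50 for xi in x]

# Random hold-back drawn at fit time: the training rows, and possibly the
# chosen cut, can change from one run to the next.
cut_run_1 = best_cut(x, y, holdback_indicator(len(x)))
cut_run_2 = best_cut(x, y, holdback_indicator(len(x)))

# Indicator created once from a fixed seed and reused, like a saved column:
# the same rows are held back, so the fitted cut is identical every time.
fixed_a = holdback_indicator(len(x), rng=random.Random(2024))
fixed_b = holdback_indicator(len(x), rng=random.Random(2024))
cut_fixed_1 = best_cut(x, y, fixed_a)
cut_fixed_2 = best_cut(x, y, fixed_b)
```

The design point is the same as in the steps above: once the hold-back assignment is stored with the data instead of being redrawn at fit time, anyone rerunning the analysis trains on the same rows.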
Re: Decision trees: same model inputs, different results
What version of JMP are you using? What options are you specifying in the launch dialog?
-Jeff
Re: Decision trees: same model inputs, different results
JMP 12.0.1
I checked the Informative Missing box, set the validation portion to 0.2, and then manually performed two splits. The response variable is categorical. The same two X variables are selected in each run, but the cut-off points for the splits and the model fit results differ.
Re: Decision trees: same model inputs, different results
Thank you David! I should have figured that out myself.
Dave