owiuser
Community Trekker

Decision trees: same model inputs, different results

I just repeated a decision tree analysis that I originally did yesterday. The data input and all the modeling options were identical in both runs. However, the results differed. I tried it again and got yet another set of results. Why does this happen? It will be disconcerting to report results that can't be independently replicated.


Accepted Solutions
David_Burnham
Super User

Re: Decision trees: same model inputs, different results

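20% of your data is being held back for validation. The validation rows are sampled randomly, so you get a different hold-back set each time. The tree is then fit to a different set of remaining rows on each run, which is why the split cut-off points and fit results change.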
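To make the effect concrete, here is a minimal Python sketch (not JSL, and not how JMP implements it). It assumes a made-up noisy data set and uses Gini impurity as a stand-in split criterion; the Partition platform uses its own criterion. The helper names `gini`, `best_cut`, and `fit_with_holdout` are hypothetical.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_cut(xs, ys):
    """Return the cut on xs that minimises the weighted Gini impurity of ys."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    best_score, best_value = float("inf"), None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot cut between two equal x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_score = score
            best_value = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_value

def fit_with_holdout(xs, ys, fraction, rng):
    """Hold back a random fraction of rows, then find the cut on the rest."""
    n = len(xs)
    held = set(rng.sample(range(n), round(n * fraction)))
    train = [i for i in range(n) if i not in held]
    return best_cut([xs[i] for i in train], [ys[i] for i in train])

# Hypothetical noisy data: y tends to be 1 when x is large.
data_rng = random.Random(0)
xs = [data_rng.uniform(0, 10) for _ in range(200)]
ys = [1 if x + data_rng.gauss(0, 2) > 5 else 0 for x in xs]

# Unseeded generator: each run holds back different rows, so the cut can move.
for run in range(3):
    print(run, fit_with_holdout(xs, ys, 0.2, random.Random()))
```

The data and options are identical in all three runs; only the rows held back differ, and that alone can shift the chosen cut-off.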

If you have JMP Pro you can create a validation column and control the hold-back sample that way.  If you don't have Pro, you can try this:

1. Add a new column and initialize the data as a random indicator - by default 20% will have the value 1 and 80% the value 0.

2. Exclude all the rows with value 1

3. Run the Partition platform with the validation portion set to 0.

It should automatically use the excluded rows for validation.
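The idea behind the steps above, sketched in Python rather than JMP: create the 0/1 hold-back flags once, store them, and reuse the same flags on every run. The function name `make_validation_column` and the seed value are assumptions for illustration; in JMP the flags would simply live in the new data table column.

```python
import random

def make_validation_column(n_rows, fraction=0.2, seed=12345):
    """Return a list of 0/1 flags; round(n_rows * fraction) rows get the value 1."""
    rng = random.Random(seed)  # fixed seed, so the same rows are flagged every time
    held = set(rng.sample(range(n_rows), round(n_rows * fraction)))
    return [1 if i in held else 0 for i in range(n_rows)]

flags = make_validation_column(50)

# Rows flagged 1 are excluded from fitting and kept for validation.
train_rows = [i for i, f in enumerate(flags) if f == 0]
valid_rows = [i for i, f in enumerate(flags) if f == 1]
print(len(train_rows), len(valid_rows))  # 40 10
```

Because the flags are fixed, anyone who receives the table with this column can repeat the analysis on the same training and validation rows.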

-Dave
Replies
Jeff_Perkinson
Community Manager

Re: Decision trees: same model inputs, different results

What version of JMP are you using? What options are you specifying in the launch dialog?

-Jeff

owiuser
Community Trekker

Re: Decision trees: same model inputs, different results

JMP 12.0.1

I checked the Informative Missing box, held back 0.2 for validation, and then manually performed 2 splits. The response variable is categorical. The same two X variables are selected in the different runs, but the cut-off points for the splits and the model fitting results are different.

owiuser
Community Trekker

Re: Decision trees: same model inputs, different results

Thank you David!  I should have figured that out myself.

Dave
