owiuser · Mar 29, 2016 09:33 AM

I just repeated a decision tree analysis that I originally ran yesterday. The input data and all the modeling options were identical in both runs, yet the results differed. I tried again and got a third set of results. Why does this happen? It would be disconcerting to report results that can't be independently replicated.

David_Burnham · Mar 29, 2016 04:58 PM

20% of your data is being held back for validation. The validation data is sampled randomly so you will get a different set each time.

If you have JMP Pro you can create a validation column and control the hold-back sample that way. If you don't have Pro, you can try this:

1. Add a new column and initialize it as a random indicator. By default, 20% of the rows will have the value 1 and 80% the value 0.

2. Exclude all the rows with the value 1.

3. Run the Partition platform with the validation portion set to 0.

It should automatically use the excluded rows for validation.
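
To illustrate the idea outside JMP, here is a minimal Python sketch. It is not JMP code and does not reproduce JMP's sampling. The function names, the seed values, and the 100-row toy data are all hypothetical. It only shows why a fresh random draw changes the hold-back set on every run, while a stored indicator column gives the same training and validation rows each time.

```python
import random


def make_validation_column(n_rows, portion=0.2, seed=None):
    """Return a list of 0/1 flags; 1 marks a row held back for validation.

    Hypothetical helper: the seed only stands in for "one particular
    random draw" and is not how JMP chooses its sample.
    """
    rng = random.Random(seed)
    n_holdout = round(n_rows * portion)
    holdout = set(rng.sample(range(n_rows), n_holdout))
    return [1 if i in holdout else 0 for i in range(n_rows)]


def split_rows(rows, validation_column):
    """Split rows into (training, validation) using the 0/1 flags."""
    training = [r for r, flag in zip(rows, validation_column) if flag == 0]
    validation = [r for r, flag in zip(rows, validation_column) if flag == 1]
    return training, validation


rows = list(range(100))  # toy stand-in for the data table

# Two runs that each draw a new random hold-back set (different seeds
# here simulate two separate sessions).
run_a = make_validation_column(len(rows), seed=1)
run_b = make_validation_column(len(rows), seed=2)

# One indicator column created once and reused for every run.
stored = make_validation_column(len(rows), seed=2016)
train_1, valid_1 = split_rows(rows, stored)
train_2, valid_2 = split_rows(rows, stored)
```

Because `stored` is created once and reused, both splits contain identical rows, so any tree fitted to them sees the same training data. `run_a` and `run_b` flag different rows, which is enough to move the split cut-off points.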

-Dave


Jeff_Perkinson · Mar 29, 2016 09:49 AM

What version of JMP are you using? What options are you specifying in the launch dialog?

-Jeff

owiuser · Mar 29, 2016 10:20 AM

JMP 12.0.1

I checked the Informative Missing box, held back 0.2 for validation, and then manually performed two splits. The response variable is categorical. The same two X variables are selected in each run, but the split cut-off points and the model fit results differ.

owiuser · Mar 31, 2016 03:01 PM

Thank you David! I should have figured that out myself.

Dave

Decision trees: same model inputs, different results
