LGH
Level I

Decision Tree question

Hi,

I created a simple DOE table (attached) and tried to build a decision tree, but JMP does not produce the result I expected.

Any input is appreciated.

Thanks,

Long

Accepted Solutions
Victor_G
Super User

Re: Decision Tree question

Hi @LGH,

Welcome to the Community!

I would recommend reading the post from @jthi, "Getting correct answers to correct questions quickly", before posting in the Forum, since you have not explained what result you expected or what result you obtained from the Decision Tree.

If your dataset comes from a designed experiment (DoE), why not use the least squares regression model that JMP provides by default?

Is your problem that the sample size is too small, which prevents the Decision Tree from splitting the data (so no results, because no split is possible when using the Decision Tree "as is")? If so, click the red triangle next to the Decision Tree to open the options, and change the Minimum Size Split (5 by default) to a value that allows the Decision Tree to grow in your case (2 could work, and you can get one split):

Victor_G_0-1765957782960.png
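
To illustrate why a minimum split size can block every split on a small table, here is a minimal Python sketch (not JSL, and not JMP's actual algorithm). It assumes the setting means "each side of a candidate split must keep at least that many rows"; the 8-run table, the column names C and Y, and the helper names are hypothetical.

```python
from statistics import mean

def sse(values):
    """Sum of squared errors around the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def best_split(x, y, min_size):
    """Return (threshold, sse_reduction) for the best split on x,
    or None when no candidate leaves min_size rows on each side."""
    rows = sorted(zip(x, y))
    total = sse([v for _, v in rows])
    best = None
    for i in range(min_size, len(rows) - min_size + 1):
        # A split cannot fall between two identical x values.
        if rows[i - 1][0] == rows[i][0]:
            continue
        left = [v for _, v in rows[:i]]
        right = [v for _, v in rows[i:]]
        gain = total - sse(left) - sse(right)
        if best is None or gain > best[1]:
            threshold = (rows[i - 1][0] + rows[i][0]) / 2
            best = (threshold, gain)
    return best

# Hypothetical 8-run table: factor C at two levels, response Y.
C = [-1, -1, -1, -1, 1, 1, 1, 1]
Y = [10.1, 9.8, 10.3, 9.9, 20.2, 19.7, 20.0, 20.4]

print(best_split(C, Y, min_size=5))  # no split: 8 rows cannot give two groups of 5
print(best_split(C, Y, min_size=2))  # a single split between the two levels of C
```

Under this assumption, a table with fewer than 10 rows can never be split at the default of 5, which would match the "no results" symptom described above.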

The default setting in JMP for this Decision Tree parameter is chosen to avoid overfitting, that is, growing a Decision Tree that splits the samples perfectly on the variables. Such a tree gives a seemingly perfect model whose leaves contain very few samples, but it is practically useless and not robust when predicting new, unseen samples (test dataset). In practice, to use Machine Learning models like this one, you need a validation strategy with training, validation and test sets. There are methods that avoid splitting your dataset "too much" and leaving part of it unused for training (such as cross-validation, leave-one-out, etc.), but in the end you will still need a test set to check that prediction performance on the test set is consistent with the training/validation sets. You can read some discussions on this topic in the forum: CROSS VALIDATION - VALIDATION COLUMN METHOD / How to perform k-fold cross-validation on ML models (e.g. neural network and random forest) and chec... / ...

Note that in this particular case, because the dataset is small and the response is linearly related to factor C, the results would be the same with a linear regression model or with the Fit Y by X platform:
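
As a rough illustration of the cross-validation idea mentioned above, here is a small Python sketch that only generates the row indices for k-fold and leave-one-out splits. The function name `k_fold_indices` and the 8-row table size are hypothetical, and this is not how JMP implements its validation options. In practice you would also shuffle the rows first and set aside a separate test set before creating the folds.

```python
def k_fold_indices(n_rows, k):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    if not 2 <= k <= n_rows:
        raise ValueError("k must be between 2 and n_rows")
    indices = list(range(n_rows))
    # Distribute rows round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        yield train, held_out

# 4-fold split of a hypothetical 8-row table.
folds = list(k_fold_indices(8, 4))
for train, validation in folds:
    print("train:", train, "validation:", validation)

# Leave-one-out is the special case where k equals the number of rows.
loo = list(k_fold_indices(8, 8))
```

Every row is used for validation exactly once, so no data is permanently withheld from training, which is the point of these methods on small datasets.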

Victor_G_1-1765957894374.png
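
To show why a one-split tree and a straight-line fit can agree, here is a short Python sketch. It assumes, purely for illustration, that factor C has only two levels; the data values and the helper `fit_line` are hypothetical and are not taken from the attached table. With two levels, the least squares line passes through the mean response at each level, which is also what a tree with a single split on C predicts.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical 8-run table: factor C at two levels, response Y.
C = [-1, -1, -1, -1, 1, 1, 1, 1]
Y = [10.1, 9.8, 10.3, 9.9, 20.2, 19.7, 20.0, 20.4]

intercept, slope = fit_line(C, Y)

# Predictions of a one-split tree: the mean response in each leaf.
low_mean = mean([y for c, y in zip(C, Y) if c < 0])
high_mean = mean([y for c, y in zip(C, Y) if c > 0])

print("line at C=-1:", intercept - slope, "tree leaf:", low_mean)
print("line at C=+1:", intercept + slope, "tree leaf:", high_mean)
```

If C had more than two levels, the two models would generally give different predictions between the levels, so this equivalence should not be assumed in general.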

I hope this answer gives you some ideas,

 

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)


LGH
Level I

Re: Decision Tree question

Hi Victor,

Thank you very much for your reply. Yes, my problem was that no split occurred because of the small sample size. Your instructions worked.

Regards,

LGH
