
Bootstrap Forest Platform > "validation" column vs "validation" portion

Two (basic?) questions about the Bootstrap Forest platform:

- I can specify a validation column that tags each record as "training", "validation" or "test" (60%, 20%, 20%). But there is also a "Validation Portion" option (default 0). What is this portion? And is it ignored if a validation column is specified?

- I provide validation records, so I assume the platform will run hyperparameter optimization. Which platform report lists the hyperparameter search history?

Accepted Solutions
Victor_G
Super User

Re: Bootstrap Forest Platform > "validation" column vs "validation" portion

Hi @LargeDormouse25,

Concerning your questions:

  1. Validation Portion is helpful if you have not specified validation (and test) sets through the Validation role. If you assign a validation column with training, validation (and test) sets to the Validation role, the Validation Portion is ignored:
    [Screenshot: Victor_G_0-1741073873130.png]
  2. In this platform, validation is mostly used to enable "Early Stopping" and to better assess the impact of the number of trees and their depth. The validation set helps prevent overfitting caused by too many or too deep individual trees in the forest.
    Note that you can fine-tune Bootstrap Forest hyperparameters (as well as those of other ML models in JMP) with a Tuning Design Table. See how to set up your table and use it for hyperparameter tuning here: Additional Example of the Bootstrap Forest Platform
    If you use a tuning design table, the list of hyperparameter combinations tested and their results is displayed in the analysis report:

    [Screenshot: TuningReport.png]

    Some previous conversations about tuning design tables for hyperparameter fine-tuning:
    Is there a way to do k-fold cross validation with boosted tree?
    Boosted Tree - Tuning TABLE DESIGN
    Malfunction in Bootstrap Forest with Tuning Design Table?
    Want to run the same tuning design table for multiple y-variables

    Note that Random Forest (called Bootstrap Forest in JMP) is one of the most robust supervised learning models available in machine learning. It is one of the few models that is relatively insensitive to hyperparameter tuning, and you can obtain good performance from it with the default settings recommended by JMP Pro.
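
To make point 1 concrete, here is a minimal Python sketch (not JMP code) of what a 60/20/20 validation column amounts to, and of the precedence described above, where a supplied validation column wins over the Validation Portion. The function names `make_validation_column` and `effective_validation` are hypothetical, and the random assignment is only an assumption about how such a column might be built; JMP has its own tools for creating validation columns.

```python
import random


def make_validation_column(n_rows, fractions=(0.6, 0.2, 0.2), seed=0):
    """Return one label per row: 'Training', 'Validation' or 'Test'."""
    labels = ("Training", "Validation", "Test")
    n_train = round(n_rows * fractions[0])
    n_valid = round(n_rows * fractions[1])
    n_test = n_rows - n_train - n_valid  # remainder goes to the test set
    column = [labels[0]] * n_train + [labels[1]] * n_valid + [labels[2]] * n_test
    random.Random(seed).shuffle(column)  # seeded shuffle, so the split is reproducible
    return column


def effective_validation(validation_column=None, validation_portion=0.0):
    """Mimic the precedence from point 1: a supplied column overrides the portion."""
    if validation_column is not None:
        return "column"
    if validation_portion > 0:
        return "portion"
    return "none"


column = make_validation_column(100)
print(column.count("Training"), column.count("Validation"), column.count("Test"))
print(effective_validation(column, validation_portion=0.3))
```

With the default Validation Portion of 0 and no validation column, the sketch returns "none", which corresponds to fitting without a held-out set.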

For more info about the "tunability" of machine learning models (the impact of hyperparameter tuning on performance), see this paper: [1802.09596] Tunability: Importance of Hyperparameters of Machine Learning Algorithms (arxiv.org)
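
As a rough illustration of the tuning design table mentioned in point 2, the Python sketch below (again, not JMP code) builds a full-factorial grid of hyperparameter combinations, one row per model to fit, and writes it as CSV text that could be imported as a table. The column names and values are illustrative assumptions only: the exact names JMP expects should be taken from the linked documentation and discussions, since a misnamed column may not be picked up by the platform.

```python
import csv
import io
import itertools

# Hypothetical column names and values, used only for illustration.
# Check the JMP documentation for the names the platform actually recognizes.
grid = {
    "Portion Bootstrap": [0.5, 0.75, 1.0],
    "Minimum Splits per Tree": [5, 10],
    "Minimum Size Split": [5, 10],
}


def build_tuning_rows(grid):
    """Return one dict per hyperparameter combination (full factorial)."""
    names = list(grid)
    return [dict(zip(names, combo)) for combo in itertools.product(*grid.values())]


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = build_tuning_rows(grid)
csv_text = rows_to_csv(rows)
print(len(rows), "combinations")
print(csv_text)
```

A full factorial grows quickly with the number of hyperparameters and levels, so a smaller designed subset of combinations may be preferable in practice.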

 

Hope this answer will help you,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)

Re: Bootstrap Forest Platform > "validation" column vs "validation" portion

Hi Victor,

Regarding the use of a tuning table for the hyperparameter search, I have a mismatch between the specified parameters

[Screenshot: LargeDormouse25_0-1741106857433.png]

and the parameters reported by Model Validation.

[Screenshot: LargeDormouse25_1-1741106911257.png]

So it looks like "Portion Bootstrap", "Minimum Splits per Tree" and "Minimum Size Split" are not considered in the model validation. Can you confirm that these parameter names are correct in this context?

 

Victor_G
Super User

Re: Bootstrap Forest Platform > "validation" column vs "validation" portion

Hi @LargeDormouse25,

Please check the following discussion, mentioned in my earlier post. There you'll find explanations of the correct naming of hyperparameters and where to find them, plus some tables already prepared with the correct names for hyperparameter tuning:

Malfunction in Bootstrap Forest with Tuning Design Table?

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)
