Egon_Gross1
Level II

Malfunction in Bootstrap Forest with Tuning Design Table?

Hi all,

I have a case where I want to apply a Bootstrap Forest approach in combination with a tuning design table. While following the instructions in JMP Help, I ran into a problem: only two of the six hyperparameters described for the tuning table made it into the models that were built, one for each run of the tuning table.

Has anyone else run into this issue?

I already discussed this with @Jonas_Rinne and we didn't come up with a satisfying solution. Or is it a bug?

 

3 ACCEPTED SOLUTIONS

SDF1
Super User

Re: Malfunction in Bootstrap Forest with Tuning Design Table?

Hi @Egon_Gross1 ,

 

  The tuning table and Bootstrap Forest platform are functioning as intended and being used somewhat correctly.

 

  First of all, the output you show is the standard output for the Bootstrap Forest platform: the Model Validation-Set Summaries table lists only N Terms, N Trees, N Trees Specified, and RSquare. The other parameters are listed below it, in the "Specifications" outline box. That is where you'll find the Minimum Splits per Tree and Minimum Size Split parameters, but for the optimal model only. JMP does not save all the data for each run of the tuning table. If you want that, you'll have to write your own JSL code. I have done this and built a generalized automated tuning program in JMP that runs models on different platforms and records the goodness-of-fit values for each run. You can then visualize the results to narrow down which fit might actually be better. The model with the highest R^2 is not always the best one, since it might not adapt as well to changes, so it is sometimes better to look at other metrics.
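The idea of recording every run, rather than only the optimal model, can be sketched outside JMP. The following is a minimal Python illustration, not the JSL program mentioned above: `run_tuning`, `best_runs`, `dummy_fit`, the metric names, and the metric values are all hypothetical stand-ins, and a real version would call an actual model fit in place of `dummy_fit`.

```python
def run_tuning(tuning_rows, fit_model):
    """Fit one model per tuning row and keep the metrics of every run."""
    results = []
    for run, params in enumerate(tuning_rows, start=1):
        metrics = fit_model(params)  # dict of goodness-of-fit values
        results.append({"run": run, **params, **metrics})
    return results


def best_runs(results, metric, top=3, higher_is_better=True):
    """Return the top runs ranked by the chosen metric."""
    return sorted(results, key=lambda r: r[metric], reverse=higher_is_better)[:top]


def dummy_fit(params):
    # Hypothetical stand-in for a real fit: made-up, deterministic metrics.
    trees = params["Number Trees"]
    return {
        "rsquare_train": 1 - 1 / trees,
        "rsquare_val": 1 - abs(trees - 100) / 200,
    }


rows = [{"Number Trees": 50}, {"Number Trees": 100}, {"Number Trees": 200}]
results = run_tuning(rows, dummy_fit)
by_train = best_runs(results, "rsquare_train", top=1)[0]
by_val = best_runs(results, "rsquare_val", top=1)[0]
```

With these made-up numbers, the run that ranks first on training R^2 differs from the run that ranks first on validation R^2, which is the kind of comparison that only becomes possible once every run is kept.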

 

  Anyway, I did mention that the tuning table was being used "somewhat" correctly. What I mean is that you really should load the tuning tables I sent as factors into a space filling design. Go to DOE > Special Purpose > Space Filling Design, click the red triangle, and select Load Factors. Specify the number of runs you want and click Fast Flexible Filling. You'll get a new data table that is space filled to explore the hyperparameter space. Please note that in the templates I sent, the first row holds the low settings of the DOE and the second row holds the high settings. When you then run the modeling platform (NN or Bootstrap Forest), select this new Fast Flexible Filling design as the tuning table.
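To illustrate what a space-filled tuning table built from a low row and a high row looks like, here is a rough Python sketch. It uses a simple Latin hypercube sample, which is a different and much simpler technique than JMP's Fast Flexible Filling, so treat it only as a picture of the concept. The function name, the column names, and the low/high values are assumptions for illustration.

```python
import random


def space_filling_rows(low, high, n_runs, integer_cols=(), seed=1234):
    """Latin hypercube sample between the low and high template rows."""
    rng = random.Random(seed)
    columns = {}
    for name in low:
        lo, hi = low[name], high[name]
        # One stratum per run, shuffled independently for each column.
        strata = list(range(n_runs))
        rng.shuffle(strata)
        values = [lo + (hi - lo) * (s + rng.random()) / n_runs for s in strata]
        if name in integer_cols:
            values = [int(round(v)) for v in values]
        columns[name] = values
    # Turn the per-column lists into one dict per run (one tuning-table row).
    return [{name: columns[name][i] for name in low} for i in range(n_runs)]


# Hypothetical template: first row = low settings, second row = high settings.
low = {"Number Trees": 50, "Number Terms": 2, "Minimum Size Split": 5}
high = {"Number Trees": 500, "Number Terms": 10, "Minimum Size Split": 50}
design = space_filling_rows(low, high, n_runs=20, integer_cols=set(low))
```

Each entry of `design` corresponds to one row of the tuning table, that is, one model to fit.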

 

  BTW, I'm happy to share the generalized JSL tuning code with you if you'd like, just let me know. I gave a presentation on it for the JSL scripters club a while ago. It has its limitations, but it's also very useful.

 

Good luck!,

DS


SDF1
Super User

Re: Malfunction in Bootstrap Forest with Tuning Design Table?

Hi @Egon_Gross1 ,

 

  Attached is the JSL code to run the generalized tuning for several of JMP's modeling platforms.

 

Thanks!,

DS


Re: Malfunction in Bootstrap Forest with Tuning Design Table?

Hi @Egon_Gross1, I checked with the platform developer, and they confirmed that BF is in fact using all the specified tuning table columns, as @SDF1 commented here; it just isn't reporting them. We are working on a fix for this, and it should be incorporated in the JMP 18 development cycle (after the initial release of JMP 18.0 later in Q1 of 2024). Cheers, @PatrickGiuliano (JMP Tech Support)


12 REPLIES

Re: Malfunction in Bootstrap Forest with Tuning Design Table?

There seems to be some inconsistency between the help page and the scripting index. Here is what the help page suggests as column names for the different hyperparameters in the tuning table: 

 

Jonas_Rinne_0-1704990842533.png

In the scripting index the columns are named slightly differently (Split/Splits):

Jonas_Rinne_1-1704990965431.png

Has anyone had the same issue before? Did we use the wrong column names, or is it something else? The first two hyperparameters work just fine.

 

Victor_G
Super User

Re: Malfunction in Bootstrap Forest with Tuning Design Table?

Hi @Jonas_Rinne and @Egon_Gross1,

I used a tuning design table to fine-tune a Boosted Tree (similar hyperparameters) last year, as described in this topic: https://community.jmp.com/t5/Discussions/Boosted-Tree-Tuning-TABLE-DESIGN/m-p/609591/highlight/true#...

I am not at my computer, so this is the best I can do for now, but I can run some tests tomorrow if that helps (and if nobody else finds a solution first). The column names are quite sensitive, which might explain why some hyperparameter columns are used and not others.
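Since a small difference such as Split versus Splits can matter, a quick check of the table's column names against the documented ones may help. Below is a minimal Python sketch of that check. The `expected` list is an assumption used for illustration and should be replaced with the exact names from the JMP help page; `check_tuning_columns` is a hypothetical helper, not part of JMP.

```python
import difflib


def check_tuning_columns(actual, expected):
    """Flag column names that do not exactly match an expected name."""
    report = {}
    for name in actual:
        if name in expected:
            report[name] = "ok"
            continue
        close = difflib.get_close_matches(name, expected, n=1, cutoff=0.8)
        if close:
            report[name] = f"not recognized; closest expected name is {close[0]!r}"
        else:
            report[name] = "not recognized"
    return report


# Assumed column names: verify these against the JMP help page before use.
expected = [
    "Number Trees",
    "Number Terms",
    "Portion Bootstrap",
    "Minimum Splits per Tree",
    "Maximum Splits per Tree",
    "Minimum Size Split",
]
actual = ["Number Trees", "Number Terms", "Minimum Split per Tree"]
report = check_tuning_columns(actual, expected)
```

The exact-match test comes first on purpose, so a near miss is reported rather than silently accepted.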

I hope this might help in the meantime,
Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)
SDF1
Super User

Re: Malfunction in Bootstrap Forest with Tuning Design Table?

Hi @Egon_Gross1 and @Jonas_Rinne ,

 

  I do model tuning with tuning tables all the time and even made templates for NN, boosted trees, bootstrap forest, and XGBoost.

 

  I've never had an issue with hyperparameters not being brought into a tuning table, except when they are constant. I usually use the space filling design DOE platform along with my tuning template tables to generate the larger tuning table with all the runs.
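A parameter is constant when its low and high settings in the template are equal, and such a parameter will not show up as a varying column in the generated design. A small Python sketch for spotting this before building the design is shown below; the helper name, column names, and values are hypothetical.

```python
def split_template(low, high):
    """Separate parameters that vary from those held constant."""
    varying = [name for name in low if low[name] != high[name]]
    constant = [name for name in low if low[name] == high[name]]
    return varying, constant


# Hypothetical template rows: low settings and high settings.
low = {"Number Trees": 100, "Number Terms": 2, "Portion Bootstrap": 1.0}
high = {"Number Trees": 500, "Number Terms": 10, "Portion Bootstrap": 1.0}
varying, constant = split_template(low, high)
```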

 

  One possibility is the column names. The other is that you will likely need to make sure the columns also have the correct Design Role in their metadata. If a column is missing that role, the DOE platforms might not function correctly.

 

  It appears that the JMP help pages have the correct column names for a tuning table. Attached are my template data tables. I hope they help.

 

DS

 

  

Re: Malfunction in Bootstrap Forest with Tuning Design Table?

@Egon_Gross1 and @Jonas_Rinne  - Thanks for your question, I am also looking into this from the JMP Technical Support Side now. It seems that the tuning table you shared in your example with us is consistent with how @SDF1 specified it, the only difference being that theirs does not appear to have the Coding Column Property specified for each Column.  I tried it both ways with your provided example on the same random seed (=1234) and it seems to make no difference.  I'll keep investigating on my side and will provide an update as soon as I can gather additional useful information.  Cheers, @PatrickGiuliano 

Egon_Gross1
Level II

Re: Malfunction in Bootstrap Forest with Tuning Design Table?

Hi @SDF1 and @PatrickGiuliano and @Jonas_Rinne and @Victor_G ,

thanks a lot for all of your contributions and support.

Trying different approaches (except the NN attempt, so far), the provided tuning tables worked, except the one for the RF procedure. I tried this with several tables from the sample folder. After using the tuning table, the Model Validation-Set Summaries section contained only the columns N Terms, N Trees, N Trees Specified, and RSquare. The other tuning parameters didn't show up, and I doubt that they were used.

Regards

Egon

SDF1
Super User

Re: Malfunction in Bootstrap Forest with Tuning Design Table?

Hi @Egon_Gross1 ,

 

  I've never had a problem using the template tables I uploaded before. How exactly are you using the tuning table or generating one from the template table? Again, if a tuning parameter is constant, the DOE platforms will ignore that parameter and not include it in a tuning table. It's unclear to me why you're having an issue as those templates work just fine for me every time I've used them. If you can provide screenshots of each step you're doing in the process, that might help in figuring out what the root issue is.

 

Thanks!,

DS

Egon_Gross1
Level II

Re: Malfunction in Bootstrap Forest with Tuning Design Table?

Hi @SDF1 ,

unfortunately, I can reproduce the results. So that you can retrace what I did, I have attached a PDF showing my steps for a regular Bootstrap Forest and for the one using the tuning table. If anything looks suspicious, I would appreciate your assistance.

KR

Egon1

Egon_Gross1
Level II

Re: Malfunction in Bootstrap Forest with Tuning Design Table?

Hello @SDF1 ,

thanks for your clarifications and comments, which really helped me a lot!

I assumed that the presentation of the results was similar to that of the other platforms, which was obviously wrong. Using your tuning table also worked well, and I got a "Model Validation-Set Summary" table for all the defined runs of the tuning table.

It might be interesting for @PatrickGiuliano and @Jonas_Rinne to implement the "full version", with all quality-measure results in a corresponding table, based on your JSL script.

Regarding your offer to share the JSL, I would appreciate it very much. Thanks in advance.

Thanks for all the support and ideas.

KR

Egon1