Hi @Mark_Bailey and @Peter_Bartell,
Just to provide some feedback, and continue the discussion on this topic, I was able to get some reasonable model data using the ARIMA platform. It does a good job of recreating a similar data set, although not good at forcasting, but that's OK. Applying the prediction formula from this model works fairly well with my actual data set, so it does provide a potentially useful path forward.
I'm not convinced this is ultimately how I wish to proceed, so I want to compare it to the other simulation options within JMP. I know that the stochastic simulation from the profiler doesn't help for reasons we've discussed before. Additionally, it's not the best approach for tuning the model for improved fitting, which I would like to do.
For my specific modeling, I'm using a bootstrap forest model and would like to improve it by tuning the model. I have done so in part with a tuning table. There are some limitations in this regard though, particularly with the number of runs I can afford (memory-wise) in the tuning table. I typically run my modeling with as many runs in my tuning table that I can afford. This narrows down on the number of trees in the forest, number of terms sampled per split, bootstrap samples, minimum splits per tree, and minimum size split. This might help to optimize the parameters of the model platform for finding a solution, but it doesn't necessarily mean an optimized model based on the data I have.
When I build the model, I create a validation column that I break into a training and test set, stratified by my response column. I then build the model on only the training data with a certain % held back for validation of the model. The model output depends on the validation column I choose (I've made several). A typical output is seen below (sorry I can't share what the info is for the response or factor columns). The details of the boostrap forest specifications change depending on how the original training and validation data was stratified.
What I'm interested in doing now is tuning the model for an improved fit possibly by using the simulate option for a model statistic. I could do this by changing the N for the training and validation sets, or by RMSE of the individual trees, etc. I know that I won't be able to use this approach to change the specifications of the bootstrap forest (the tuning table does that), but I would like to use it in such a way that I can tune the model, e.g. the number of samples used for training and validation, or the number of splits, in order to maximize r^2.
To summarize: I would like to use a simulation method to help train the data. I know the simulate platform from the profiler doesn't work, which leaves me with ARIMA (possible) or the bootstrap simulate option for a model statistic. I'm not sure how to best use the latter simulation method for tuning the model.
Any thoughts or feedback you might have on tuning models through simulations would be appreciated.
Thanks in advance for your time!