Hi,
I want to simulate some responses from a model I have built using the simulator in Fit Model. As I have only fixed factors, I want to generate a distribution of simulated responses by adding random noise to the response. My questions are
1. What exactly is the default random noise value for the response provided by JMP? (On one help page I saw this was the Root Mean Square Error but I'm not sure how to interpret this in practical terms.)
2. How do I capture measurement error in the response when simulating? That is, do I replace the default random noise value with a std dev for measurement error, or do I combine the default random noise value with the measurement error value (perhaps by summing the two variances and then taking the square root)?
Thanks in advance
Hi @Alicia ,
If you have measurements on multiple batches that were made under the same conditions/recipes, then you should already have a good estimate of the batch-to-batch variation. Also, if you have multiple measurements on a single batch, then you should also have an estimate of the noise from the measurement system itself.
From that, once you have your prediction formula, you can create a new column formula that is the prediction formula plus a normally distributed noise term that is centered at zero, with a standard deviation equal to the square root of the sum of the batch-to-batch variance and the measurement variance, i.e. sqrt(batch variance + measurement variance). Adding the variances this way assumes the two noise sources are independent.
It should look something like below -- I just used the Diabetes.jmp file to create an example prediction formula and then added a Random Normal (mu,sigma) function to add noise. This would add a normally distributed random number with mean 0 and standard deviation sigma to your prediction formula.
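For anyone who wants to check the arithmetic outside JMP, here is a minimal Python sketch of the same idea: a fixed predicted value plus zero-mean normal noise whose SD combines the two sources. The function names and all the numbers (predicted value 100, batch SD 3, measurement SD 4) are made up for illustration, and it assumes the two noise sources are independent.

```python
import math
import random
import statistics


def combined_sd(sd_batch, sd_measurement):
    """Combine two independent noise sources by adding their variances."""
    return math.sqrt(sd_batch ** 2 + sd_measurement ** 2)


def simulate_responses(predicted, sd_batch, sd_measurement, n, seed=None):
    """Return n simulated responses: prediction plus zero-mean normal noise."""
    rng = random.Random(seed)
    sigma = combined_sd(sd_batch, sd_measurement)
    return [predicted + rng.gauss(0.0, sigma) for _ in range(n)]


# Hypothetical inputs, not taken from any real data set.
sims = simulate_responses(predicted=100.0, sd_batch=3.0, sd_measurement=4.0,
                          n=10000, seed=1)
print("combined SD:", combined_sd(3.0, 4.0))
print("simulated mean:", statistics.mean(sims))
print("simulated SD:", statistics.stdev(sims))
```

The simulated mean and SD should land close to the predicted value and the combined SD. Note that the combined SD is smaller than the plain sum of the two SDs, which is why the variances are added rather than the SDs.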
Hope this helps!
DS
What do you mean by "fixed factors"? Are they categorical? If they are continuous factors, they might vary in practice. If they do, then you can include this variation in your simulation, too.
JMP supplies the RMSE from the currently selected fitted model for the response SD. It is assumed to be constant for all response levels and independent of the model predictors.
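To make the RMSE a bit more concrete, here is a rough Python sketch of how it is usually computed: the square root of the error sum of squares divided by the error degrees of freedom (number of observations minus number of estimated parameters). The `rmse` function and the tiny data set are hypothetical, with the predictions simply being the two group means.

```python
import math


def rmse(actual, predicted, n_params):
    """Root mean square error using error degrees of freedom (n - n_params)."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    sse = sum(r * r for r in residuals)   # error sum of squares
    df_error = len(actual) - n_params     # error degrees of freedom
    return math.sqrt(sse / df_error)


# Made-up example: two groups of three runs, each predicted by its group mean.
actual = [9.0, 10.0, 11.0, 19.0, 20.0, 21.0]
predicted = [10.0, 10.0, 10.0, 20.0, 20.0, 20.0]
print(rmse(actual, predicted, n_params=2))  # prints 1.0
```

So the RMSE only reflects the scatter that is present in the fitted data. If all runs came from one batch, for example, batch-to-batch variation would not show up in it.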
There are just 2 factors in the model and they are both categorical. I basically have a 2x2 matrix where I have results for 3 of the combinations and I'm modelling the results for the 4th combination. When simulating responses for the 4th combination I want to add some noise to the response:
1. coming from batch-to-batch variation (does JMP estimate this as RMSE?) and
2. coming from measurement error.
My question is: how do I do this in the Profiler's Simulator? Do I use the RMSE that JMP estimates and then add the measurement error to it (by combining the variances and taking the square root)?
Help much appreciated!
Yes, use the Prediction Profiler. Set the levels of the two categorical factors that define the condition that you want to simulate. Activate the Simulator in the Prediction Profiler. Leave the factors as they are. Change the Responses information in the Simulator to represent the independent random noise for your purpose. JMP supplies the RMSE from fitting the current model only as a convenient default, but you can enter another SD that you determined elsewhere and that includes all the sources of variation, some of which might not have been included in the data you are fitting (e.g., multiple batches). Now click the Simulate button under the Profiler on the right side.
Are you asking how to determine the SD for your simulation?
Hi,
I wondered how to determine the SD for the simulation.
Do you have any guidance on this?
Thanks in advance
Hi @PYS,
Welcome to the Community!
It might be better to ask for help in a new post: you'll get more views (and responses), and since this post is already a few months old, people may not see it. It also makes it easier for other members to track questions and their corresponding solutions.
To answer your question, here are some thoughts about the use of the Simulator:
With the Simulator, you can choose to add noise to your inputs/factors and/or to your outputs. There might be different options:
I hope this answer will help you figure out how you can use the Simulator for your needs.
Don't hesitate to create a new post if you have more precise questions on this, or a dataset to illustrate your problem.
Hello @Victor_G ,
Thanks for the reply and the posting tips :)
Hi @Alicia ,
I think I understand what you want to do, but not 100% sure on it.
Do you know the correlation matrix for your factors and response(s)? If so, you can use this to make a better estimate of variance and so forth when doing your simulation. Also, do you have a good estimation of the variance or standard deviation of the factors and response?
I ask because if you have a prediction formula (either saved to the data table or within a Fit Model report), you can select Simulator under the red hot button for the profiler. This will allow you to modify the mean and standard deviation of the model factors -- either as independent normal noise or as multivariate correlation. You can then generate a large data table that is all simulated data with the right kind of correlation structure and noise.
In answer to your questions, RMSE can be very simply interpreted as the standard deviation of the residuals, i.e. of the differences between the actual data and the model's predictions. This is an overly simplified explanation, but it's one that is conceptually easy to understand.
For the second question, this should be captured by the noise of the factors going into the model. The noise from the factors will translate into noise in the output. Again, it will be an estimate, but if you know the noise in your factors well, and your model is good, you should have a good estimate for your response noise.
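As a rough illustration of how factor noise turns into response noise, here is a small Python Monte Carlo sketch. It only applies to continuous factors, and the model `y = 2 + 3x`, the factor mean, and the factor SD are all invented for the example.

```python
import random
import statistics


def predict(x):
    """Hypothetical fitted model; the coefficients are made up."""
    return 2.0 + 3.0 * x


def propagate_factor_noise(mean_x, sd_x, n, seed=None):
    """Draw noisy factor values and push each one through the model."""
    rng = random.Random(seed)
    return [predict(rng.gauss(mean_x, sd_x)) for _ in range(n)]


ys = propagate_factor_noise(mean_x=5.0, sd_x=0.5, n=20000, seed=7)
print("response mean:", statistics.mean(ys))
print("response SD:", statistics.stdev(ys))
```

For a linear model like this one, the response SD should come out close to the slope times the factor SD (here about 3 x 0.5 = 1.5). With categorical factors held at fixed levels there is no factor noise to propagate, so the noise has to be added on the response side instead.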
Hope this helps!
DS
Hi @SDF1 , thanks for your response and explanation of RMSE. I am using just categorical factors in the model and just want to capture noise coming from the response in the simulator... 1. as batch-batch variation & 2. as measurement error. (see notes above).
Really appreciate your comments