shoffmeister

Custom Design: Default Number of Runs

I am comparing the two different settings "Minimum" and "Default" for the number of runs in an optimal design.

Could anyone explain how JMP uses the additional runs when the number of runs is set to Default instead of Minimum? Are these additional runs selected optimally, i.e. chosen to maximize D-/I-optimality? Or does JMP deliberately add center points and/or replicates, e.g. to make sure that a Lack-of-Fit test can be performed?

I could not find an answer in the documentation. Thanks for any hints in advance!

1 ACCEPTED SOLUTION

Re: Custom Design: Default Number of Runs

The additional runs are chosen to maximize the optimality criterion. To ensure that a Lack-of-Fit test is available, you need to specify either center points or replicate runs in the design generation phase. When you specify them there, the custom designer generates an optimal design conditional on including the specified number of center points/replicate runs.
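As a rough illustration of why replicates or center points matter for Lack-of-Fit, here is a minimal Python sketch that counts the degrees of freedom such a test needs. The function name and the example designs are made up for illustration and this is not how JMP does it internally. The test needs pure-error degrees of freedom (from replicated runs) and lack-of-fit degrees of freedom (more distinct design points than model parameters).

```python
# Sketch only: hypothetical helper and made-up designs, not JMP output.

def lack_of_fit_df(runs, n_params):
    """Return (pure_error_df, lack_of_fit_df) for a list of design runs.

    runs     -- list of tuples, one tuple of factor settings per run
    n_params -- number of model parameters, including the intercept
    """
    n_distinct = len(set(runs))
    pure_error_df = len(runs) - n_distinct   # supplied by replicated runs
    lof_df = n_distinct - n_params           # distinct points beyond the model
    return pure_error_df, lof_df

# Two factors coded -1/+1; (0, 0) is the center point.
corners = [(-1, -1), (-1, 1), (1, -1), (1, 1)]

designs = {
    "corners only": corners,
    "corners replicated": corners * 2,
    "corners + 4 center points": corners + [(0, 0)] * 4,
}

# 3 parameters: intercept + 2 main effects.
# 4 parameters: the same plus the two-factor interaction.
for n_params in (3, 4):
    for name, runs in designs.items():
        pure, lof = lack_of_fit_df(runs, n_params)
        testable = pure > 0 and lof > 0
        print(n_params, name, pure, lof, testable)
```

With no replication at all, the pure-error count is zero, so no test is possible whichever model is fitted.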

The general idea for the default run size is that it is a multiple of the number of levels in the factors and leaves at least 4 runs available to estimate the error for the specified model.

As mentioned in Brad Jones' blog post that David referenced in his reply, I too prefer replicate runs over center points since they are more efficient for estimating the model effects. There's also a blog post on individual run replication.
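To make the efficiency point concrete, here is a small Python sketch under assumed conditions: two factors coded -1/+1, a model with both main effects and their interaction, and 8 runs either way. It compares the D-criterion, det(X'X), for replicated corner runs against corner runs plus center points. The designs and helper names are hypothetical, and JMP's coordinate-exchange search is not reproduced here.

```python
from fractions import Fraction

def model_matrix(runs):
    """Rows [1, x1, x2, x1*x2]: intercept, main effects, interaction."""
    return [[1, x1, x2, x1 * x2] for x1, x2 in runs]

def information_matrix(X):
    """Compute X'X for a model matrix given as a list of rows."""
    p = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(p)]
            for i in range(p)]

def determinant(M):
    """Determinant by Gaussian elimination using exact fractions."""
    A = [[Fraction(v) for v in row] for row in M]
    n = len(A)
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            det = -det
        det *= A[col][col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
    return det

corners = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
replicated = corners * 2                 # 8 runs, every corner twice
with_centers = corners + [(0, 0)] * 4    # 8 runs, 4 of them center points

det_rep = determinant(information_matrix(model_matrix(replicated)))
det_ctr = determinant(information_matrix(model_matrix(with_centers)))
print(det_rep, det_ctr)  # prints: 4096 512
```

The center-point rows contribute only to the intercept column, so for this model they add nothing to the information about the main effects or the interaction.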


David gives some very good advice regarding the default. I also almost always find myself using Simulate Responses to walk through some analyses when creating a design.

Hope that helps.

Cheers,
Ryan

2 REPLIES
David_Burnham

Re: Custom Design: Default Number of Runs

According to the documentation, "this value is based on heuristics for creating balanced designs with a few additional runs above the minimum." I don't know the details of the heuristic, but if the minimum is 7 runs then the default is likely to be 12, and if the minimum is 13 then I'd expect the default to be 16. It ensures some spare degrees of freedom and some semblance of balance.

However, the optimality criterion is evaluated with respect to your model specification, so, particularly when the model has only main effects and two-factor interactions, the D-optimal design would have no interest in adding centre points. Brad Jones has some strong opinions on the inefficiency of centre points - more here - but you can still add them manually for the reasons you outlined.

I'd always recommend that you "walk through" the entire process: definitely look at the design evaluation (power analysis, correlations, etc.) and convince yourself that the design generates the information you are looking for. I also think it's good to simulate some data (either with a column formula or with the Simulate Responses option) and verify that you are comfortable with the "analyzability" of the design.

Final thoughts: the default is just a safe minimum, not a recommendation. JMP has no idea how many runs you can afford or how noisy your environment is. The power analysis tool is, I think, a bit awkward to use, but it is particularly useful for evaluating the benefit of adding runs to a custom design. Uff, a bit random, but hope that helps.

-Dave
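For a rough feel of the benefit of extra runs that Dave mentions, here is a minimal Python sketch of an approximate power calculation. It rests on simplifying assumptions that are mine, not JMP's: an orthogonal two-level design coded -1/+1 (so each coefficient has standard error sigma / sqrt(n)), a known noise level, and a normal approximation in place of the t/F distributions a proper power analysis uses. The function name and the effect and sigma values are hypothetical, and the result will be optimistic for small designs.

```python
from math import sqrt
from statistics import NormalDist

def approx_power(n_runs, effect, sigma, alpha=0.05):
    """Approximate power to detect a coefficient of size `effect`.

    Assumes an orthogonal -1/+1 coded design, so the coefficient's
    standard error is sigma / sqrt(n_runs), and treats sigma as known
    (normal approximation, two-sided test at level alpha).
    """
    z = NormalDist()
    se = sigma / sqrt(n_runs)
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect / se
    # Probability that the test statistic falls in either rejection tail.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Compare a few candidate run sizes for an assumed effect and noise level.
for n in (8, 12, 16):
    print(n, round(approx_power(n, effect=1.0, sigma=1.0), 3))
```

Numbers from a sketch like this are only a sanity check next to the design evaluation in JMP itself, which accounts for the actual design and the error degrees of freedom.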
