Solved: Choosing a model for a custom design

Jul 6, 2017 07:43 PM

Hi, I'm curious about your thoughts on how to pick a model for a custom DOE (two continuous factors, for example). I know that if my goal is optimization, I choose an RS model and JMP will add runs closer to the center of the design space. But when I examine the design points, I always get the feeling that there are large gaps between the factor levels chosen by JMP (I check using a scatterplot). This worries me because I may miss the sweet spot for optimization. If I increase the number of runs, I end up with more replicates without closing these gaps. I found that I have to use higher-order models (Powers: 4th or 5th) to fill these spaces.

I'm a little new to DOE, so I'm trying to figure out whether this is the right way to think about it, or whether I should just trust that I will not miss the sweet spot with an RSM as long as certain metrics (variance, power) in the design evaluation look good. Thanks!

Mark_Bailey · Jul 7, 2017 06:10 AM

I do not expect any experiment to include the "sweet spot" as one of its runs. Even so, I use an experiment every time I need to learn how something works from empirical evidence.

You have encountered a common point of confusion for anyone who is graduating from mere testing to experimenting, so do not feel bad about it. The purpose of an experiment is to obtain the best data to fit the model that adequately describes the response to factor changes. It is not a test. This concept is a major step towards becoming a master, a Greedy Experimenter.

The principles of design work together in any design method. We will use custom design.

You choose a meaningful response or more than one if necessary.
You consider all the factors that affect the response and select a set of them for study. The rest will be held constant throughout the experiment.
You use continuous factors as much as possible. You use the widest range for each factor to ensure a large effect that will have the most power and the most efficient estimation.
You select terms for the linear regression model to represent all the effects of these factors including main effects, interaction effects, and non-linear effects of sufficient order. It is common that second order effects are sufficient over the factor range but higher order terms are allowed.
You select the criterion for optimality: D-optimal to emphasize parameter estimation and testing or I-optimal to emphasize prediction. (There are other criteria for special cases.)
You decide on the budget of runs for this experiment and make the design.
JMP determines the factor levels that are most informative with respect to the model. That is, they collectively provide the best data to support the estimation of the model parameters.

(Note that I purposely ignored many considerations in designing an experiment to illustrate the general thought process.)

That last task, choosing the factor levels, is not your job anymore. Individual runs are no longer expected to yield the answer, the sweet spot, as they would in a simple test. This approach is driven by the model; it is model-centric. For example, if you know that there are no non-linear effects of these factors, then any factor levels other than the end points will provide less information about the effect (slope parameter). A Greedy Experimenter would never waste a run on conditions that provide less than the maximum information available.
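The end-point claim can be checked with a short calculation. The sketch below is only an illustration, not something JMP runs: it assumes a single factor on a coded range of -1 to 1, a straight-line model, and two made-up six-run designs. For a straight-line least-squares fit, the variance of the slope estimate is the error variance divided by the sum of squared deviations of the factor levels from their mean, so a smaller value of the function below means a more precise slope.

```python
def slope_variance_factor(levels):
    """Var(slope) / sigma^2 for a straight-line least-squares fit: 1 / Sxx."""
    mean = sum(levels) / len(levels)
    sxx = sum((x - mean) ** 2 for x in levels)
    return 1.0 / sxx


# Two hypothetical six-run designs on the coded range [-1, 1].
endpoints_only = [-1, -1, -1, 1, 1, 1]
evenly_spread = [-1, -0.6, -0.2, 0.2, 0.6, 1]

print("end points only:", slope_variance_factor(endpoints_only))
print("evenly spread:  ", slope_variance_factor(evenly_spread))
```

With these levels, the end-point design has a sum of squares of 6 and the evenly spread design 2.8, so the slope variance is a little more than twice as large when the runs are spread out to "fill the gaps".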

In an experiment, the sweet spot is determined by a prediction from the model. It is not the result of a test of a particular condition. (The prediction must, of course, be confirmed empirically.)
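As a minimal sketch of how a model prediction locates a sweet spot that was never run, consider one factor tested at the coded levels -1, 0, and +1 with a second-order model. The response values and function names below are hypothetical, chosen only for illustration; with one run per level the quadratic passes exactly through the three points, so the coefficients have a simple closed form.

```python
def fit_quadratic_three_levels(y_low, y_mid, y_high):
    """Fit y = b0 + b1*x + b2*x^2 exactly through runs at x = -1, 0, +1."""
    b0 = y_mid
    b1 = (y_high - y_low) / 2.0
    b2 = (y_high + y_low) / 2.0 - y_mid
    return b0, b1, b2


def stationary_point(b1, b2):
    """Return x where dy/dx = 0; this is a maximum when b2 < 0."""
    return -b1 / (2.0 * b2)


# Hypothetical responses observed at the coded levels -1, 0, +1.
b0, b1, b2 = fit_quadratic_three_levels(6.62, 9.82, 9.02)
x_best = stationary_point(b1, b2)
y_best = b0 + b1 * x_best + b2 * x_best ** 2

print("predicted optimum at x =", round(x_best, 3))
print("predicted response     =", round(y_best, 3))
```

Here the predicted optimum falls near x = 0.3, in the gap between the runs at 0 and +1. That setting would then be run as a confirmation.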

Selecting the model for the design is accomplished by one of several approaches and their associated beliefs:

Start with the largest possible model that includes terms to represent all the potential effects. For example, the default RSM model in JMP includes terms for all the main effects, two-factor interaction effects, and second order non-linear effects. This choice is the most expensive in terms of the number of runs. On the other hand, it will likely not require further experimentation (that is, augmenting a smaller original design chosen for economy).
Start with the smallest possible model that includes terms to represent only the main effects. The default model in JMP provides this minimal set of terms. This design will produce biased estimates if higher order effects are present and their estimates are correlated with the estimates of the main effects. These conundrums can be resolved with additional runs for a new model with additional terms during design augmentation.
Start with the model you believe is most likely adequate. You do not use a default model. You choose the main effects, interaction effects, and non-linear effects for each case. Custom design offers a further choice for each term: estimable "if possible". This option means that you can include a term in your model even if you are not certain, before you run the experiment, that it is non-null. The resulting design is not optimal for any single model but is optimal over the set of all possible models given the potential terms.
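The bias mentioned for the smallest-model approach can be made concrete with an alias matrix, which shows how much of each omitted effect leaks into each fitted estimate. The sketch below is an illustration under stated assumptions: it uses a hypothetical four-run half fraction of a two-level, three-factor design (generator C = A*B) rather than a design produced by JMP, and it relies on the fitted columns being mutually orthogonal so that the alias matrix simplifies to cross-products divided by the number of runs.

```python
from itertools import product

# Hypothetical 4-run half fraction of a 2^3 factorial with generator C = A*B.
runs = [(a, b, a * b) for a, b in product((-1, 1), repeat=2)]

# Columns of the fitted model (intercept and main effects only).
fitted = {
    "Intercept": [1 for _ in runs],
    "A": [a for a, b, c in runs],
    "B": [b for a, b, c in runs],
    "C": [c for a, b, c in runs],
}

# Columns of the two-factor interactions left out of the model.
omitted = {
    "A*B": [a * b for a, b, c in runs],
    "A*C": [a * c for a, b, c in runs],
    "B*C": [b * c for a, b, c in runs],
}


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


# The fitted columns are orthogonal here, so X1'X1 = n*I and the alias
# matrix (X1'X1)^-1 X1'X2 reduces to X1'X2 / n.
n = len(runs)
alias = {
    f_name: {o_name: dot(f_col, o_col) / n for o_name, o_col in omitted.items()}
    for f_name, f_col in fitted.items()
}

for f_name, row in alias.items():
    print(f_name, row)
```

An entry of 1 means the fitted estimate absorbs the full omitted effect. In this small design each main effect is completely confounded with one interaction, which is the kind of ambiguity that additional runs in an augmented design are meant to resolve.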

Notice that I never once resorted to considering the factor levels that achieve the sweet spot. I don't have to. I shouldn't. That task is not the role of the design. That task is the role of the model, so let's get the best model we can. That result requires the best design of the data for estimating our model.
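To connect this back to the original worry about gaps between factor levels, the sketch below compares how precisely two designs predict the response at settings that were not run. It assumes one factor on a coded range of -1 to 1 and a straight-line model, and both six-run designs are hypothetical. For that model, the variance of the predicted mean at a setting x, relative to the error variance, is 1/n plus the squared distance of x from the mean level divided by the sum of squared deviations of the levels.

```python
def prediction_variance_factor(levels, x):
    """Var(predicted mean at x) / sigma^2 for a straight-line fit."""
    n = len(levels)
    mean = sum(levels) / n
    sxx = sum((xi - mean) ** 2 for xi in levels)
    return 1.0 / n + (x - mean) ** 2 / sxx


# Two hypothetical six-run designs on the coded range [-1, 1].
endpoints_only = [-1, -1, -1, 1, 1, 1]
evenly_spread = [-1, -0.6, -0.2, 0.2, 0.6, 1]

for x in (0.0, 0.4, 0.8):
    print(
        x,
        round(prediction_variance_factor(endpoints_only, x), 4),
        round(prediction_variance_factor(evenly_spread, x), 4),
    )
```

Even at x = 0.4, where the end-point design has no nearby run, its prediction variance is no larger than that of the evenly spread design. This holds only if the straight-line model is adequate; if curvature is possible, the model needs second order terms and the design will then place runs at interior levels to support them.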

I hope that this brief explanation helps. It falls short of full lessons in an education about DOE but hopefully it clarifies where the sweet spot comes from in DOE and how the factor levels should be selected (by the design, not by you).

See JMP Help > Books > Design of Experiments for more information about using custom design with JMP.

See "Optimal Design of Experiments: A Case Study Approach" by Peter Goos and Bradley Jones for a modern approach to experimentation if you prefer to learn on your own.

See JMP Software: Custom Design of Experiments if you prefer classroom instruction with an expert.
