DOE: All out custom design or augment design after DSD?

dtsang · Jun 8, 2023 1:58 PM

I have 7 continuous factors and 2 discrete numeric factors (presence/absence). I would like to use DOE to identify which factors play a role in the response, identify if there are any interactions between factors and finally determine the range of factors for optimal response.

I can think of two ways to do it and I am not sure which one gives me a more accurate model to identify factors and interactions.

1) I remember from my DOE classroom training 2 years ago that JMP now prefers users to use Custom Design. I used Custom Design with main effects and two-factor interactions for all 9 factors, which gave 52 runs. I would then need to set up another DOE with an RSM design to identify quadratic effects.

2) I used DSD and came up with 25 runs. After analyzing the data from the DSD, I should be able to narrow down to a few key factors (for example, 4-5 factors). I will then augment the design with these key factors, adding two-factor interactions and quadratic effects, and fix the other factors using the analysis from the DSD. I would then need to set up 9 extra runs and group them in a new block.

My second approach needs 18 fewer runs (52 vs. 25+9) and gives me the flexibility to change the range of certain factors in the augmented design. I understand that if I change the range of certain factors, I need more runs to lower the prediction variance.
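As a rough check on the run counts quoted above, here is a minimal Python sketch (not JMP output) that counts the parameters in each candidate model. The function name `model_terms` is hypothetical, and it assumes the two presence/absence factors are two-level, so only the 7 continuous factors get squared terms.

```python
from math import comb

def model_terms(n_continuous, n_two_level, interactions=True, quadratics=False):
    """Count model parameters, including the intercept (hypothetical helper)."""
    n_factors = n_continuous + n_two_level
    terms = 1 + n_factors                # intercept + main effects
    if interactions:
        terms += comb(n_factors, 2)      # all two-factor interactions
    if quadratics:
        terms += n_continuous            # squared terms for continuous factors only
    return terms

# Approach 1: main effects + two-factor interactions for all 9 factors
print(model_terms(7, 2))                   # 46
# Same model plus quadratic terms for the 7 continuous factors
print(model_terms(7, 2, quadratics=True))  # 53
# Difference between the run totals quoted in the post
print(52 - (25 + 9))                       # 18
```

Under these assumptions, the 52-run design has 6 more runs than the 46 parameters it must estimate. Counting parameters only gives a lower bound on the number of runs; it says nothing about power or aliasing.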

I can only see advantages to Augment Design. What are the advantages of doing a custom design with all the factors at once? When should I use Augment Design vs. Custom Design?

Thanks.

Mark_Bailey · Sep 17, 2019 07:11 AM

I don't know where you got the idea that for (1) you need a custom design with 52 runs that requires an augmentation. Similarly, I don't know where you got the idea that for (2) you must augment the DSD to estimate more than the main effects.

The DSD is a special case of the alias-optimal custom design for the RSM model. That is how the DSD was discovered.

You might not have to augment the DSD. If no more than 3 factors are active, then the DSD for 9 factors will support the full quadratic model in those factors, which is often sufficiently accurate over the range of the response. Adding extra runs will increase both the power and the coverage of the design because you are effectively adding 'fake' factors. That change makes the 'sparsity of effects' principle more likely to hold.

The custom design is still the best choice when one of the special design methods (e.g., DSD) is not appropriate or capable. For example, if you have hard-to-change factors, then you should not use the DSD. Custom design also offers other opportunities for screening. You could specify the RSM model and then change the estimability of all the higher-order terms from 'necessary' to 'if possible.' This change will drastically reduce the minimum number of runs. Then add back 3-4 runs for every potential term that you expect to be active. For example, if I expect two non-linear effects and three interactions among your 7 continuous and 2 categorical factors, my custom design would have 10 + (2+3)*4 = 30 runs. I should not need to augment this design in the future. This Bayesian I-optimal design exhibits very high power if the effect size is at least twice the standard deviation of the response.
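The run-count rule of thumb in the paragraph above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not a JMP feature: the function name `run_budget` is hypothetical, and it reads the "10" in the example as the intercept plus 9 main effects, which is an assumption about how that number was obtained.

```python
def run_budget(n_main_effects, expected_active_terms, runs_per_term=4):
    """Minimum runs for the 'necessary' terms plus a few runs per
    higher-order term expected to be active (hypothetical helper)."""
    minimum_runs = 1 + n_main_effects   # assumed: intercept + main effects
    return minimum_runs + expected_active_terms * runs_per_term

# Two non-linear effects and three interactions, 4 runs each
print(run_budget(9, 2 + 3))                   # 30
# Lower end of the suggested 3-4 runs per term
print(run_budget(9, 2 + 3, runs_per_term=3))  # 25
```

The result is a planning number for the Custom Design run count, not a guarantee of power; the design's power analysis should still be checked before running the experiment.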

The only way to assess the accuracy of the model is to confirm its predictions. I recommend confirming at least two predictions: settings for the optimal response and settings for a poor response. I trust a model a lot more if it can predict the response anywhere.

Finally, be sure to always use a wide factor range for all the continuous factors to ensure maximum power and minimum standard error of the estimates. Do not change, limit, or decrease the factor range in any experiment because you think you know where the optimum setting is. Why add extra runs when a wide factor range gives you more empirical data and experience in a more economical design? Why localize the model?
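To make the trade-off between factor range and extra runs concrete, here is a simplified Python sketch. It assumes a single continuous factor, a straight-line model, and runs split evenly between the low and high settings; the function name and the numeric values are illustrative only. In that case the standard error of the slope is sigma / (half_range * sqrt(n_runs)).

```python
from math import sqrt

def slope_standard_error(sigma, half_range, n_runs):
    """Standard error of the slope for a straight-line fit with runs split
    evenly between center - half_range and center + half_range."""
    return sigma / (half_range * sqrt(n_runs))

narrow = slope_standard_error(sigma=1.0, half_range=1.0, n_runs=16)
wide = slope_standard_error(sigma=1.0, half_range=2.0, n_runs=16)
more_runs = slope_standard_error(sigma=1.0, half_range=1.0, n_runs=64)

print(narrow)     # 0.25
print(wide)       # 0.125
print(more_runs)  # 0.125
```

In this simplified setting, doubling the factor range buys the same precision as quadrupling the number of runs. The comparison only holds while the assumed model remains adequate over the wider range.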

dtsang · Sep 22, 2019 07:19 PM

Thank you. I have a better understanding of how to plan a DOE now.

When I first learned about DOE, I was taught to use Custom Design or DSD to look at main effects and interactions and rule out a few factors. Then I would use a Response Surface model to identify any non-linear effects and maximize the response. I always followed this approach regardless of the number of runs I could do or the number of factors I was testing. After reading your comments, I think I can do it all in a single Custom Design, and I understand why. Our team is going to do 32 runs and see how it goes.

Thanks for reminding me to use a wide factor range. That is one thing I always keep in mind and tell others as well.
