Solved: Re: RSM vs DSD vs Augmented DSD vs Space Filled Augmented DSD

limac1 · Oct 18, 2019 10:48 PM

Hello Community,

Reaching out because I'm currently in a dilemma about what model to proceed with. I have run a 5-factor CCC RSM previously and successfully constructed quadratic relationships with R-squared values above 0.95. Awesome, right? Also, it was a half factorial design, so 27 runs (1 center point, since they are simulations).

This time around, I'm planning on running another 5-factor RSM or DOE. The caveat is that with the previous RSM alpha points were easily designed. This time around that's not the case: the limits are hard limits. I've considered a CCF or CCI (inscribed); however, these two pose risks on either precision or covering the whole design space of interest.

Question: How many runs or what kind of alternate models would you recommend instead of a CCC? How can I predict linear, two way interactions and quadratic terms with alternate designs with equal to 27 runs or less with good confidence? I'm afraid of using a DSD, an Augmented DSD, or possibly even a DSD augmented with a space-filling DOE because of the differences in the correlation matrix compared to a traditional RSM.

CCC is out of the picture because there are designs that are impossible to make.

Thank you guys!

Mark_Bailey · Oct 19, 2019 12:30 PM

You seem to be bouncing all over the place! Let's calm down and start over.

The ultimate goal is to fit a successful model. Your choice of a model is a multiple regression model using a linear predictor. You can collect any data you want and fit such a model. Your data might pose problems for the fitting, however. So the design attempts to find the best data for the linear model. The design is not immutable, though.

Also, the design principles are inescapable. The design methods are interchangeable. I recommend custom design because it is a modern method that offers the most flexibility. And if it yields a design with impractical or impossible treatments, just change them.

Let's talk about your situation.

That is an enviable R square result in the first case. Did the model confirm?
What do you mean by "the alpha points are easily designed," "the limits are hard limits," and "pose risks on either precision or covering the whole design space of interest?"
Asking "how many runs or what kind of alternative models" is comparing 'apples to oranges.' The sample size is a consideration for the sake of estimability and power. The model is a consideration about bias and variance. What are you asking?
You ask "how can I predict linear, two way interactions and quadratic terms with alternate designs with equal to 27 runs or less with good confidence?" The Estimation Efficiency and Power Analysis can help you answer that question.
The DSD is for screening, so it will succeed only if the screening principles hold. For example, do you expect at most two of the five factors to have active effects?
You say that "there are designs that are impossible to make." Why?

limac1 · Oct 19, 2019 06:55 PM

Thanks Mark for the quick reply, it's hard to explain everything in a few paragraphs so I'll do my best to answer some of the points you made.

That is an enviable R square result in the first case. Did the model confirm?
- Previous RSMs confirmed good predicted and adjusted R-squared values, which lined up with test data, so the half factorial model was deemed great for our purpose.
What do you mean by "the alpha points are easily designed," "the limits are hard limits," and "pose risks on either precision or covering the whole design space of interest?"
- What I mean by alpha points is that, for a CCC design, they are "exaggerated" design points that widen the window of optimization. That was fine for the previous RSM, but this new design is more complicated and going to those alpha points is now impossible. Say my bottom limit for factor 4 is 0; an alpha point would generate a negative value.
Asking "how many runs or what kind of alternative models" is comparing 'apples to oranges.' The sample size is a consideration for the sake of estimability and power. The model is a consideration about bias and variance. What are you asking?
- Right right, I know that with RSMs there's no aliasing, and no confounding terms up to quadratic relationships. On the other hand, a DSD with augmented design (space filled or whatever way) will confound some of those terms, BUT ultimately can still point predict the same values as an RSM.
You ask "how can I predict linear, two way interactions and quadratic terms with alternate designs with equal to 27 runs or less with good confidence?" The Estimation Efficiency and Power Analysis can help you answer that question.
- Got it, I'll make sure to do this once I pick an appropriate model.
The DSD is for screening, so it will succeed only if the screening principles hold. For example, do you expect at most two of the five factors to have active effects?
You say that "there are designs that are impossible to make." Why?
- This one was more about the alpha points, but if it's inside the window of interest and the model is able to predict within that window really well, then we're golden! I know there will be some degree of confounding, but if it's able to predict with great confidence like an RSM, then I think this is something I would love to explore.
- I've been reading about DSD with space-filling DOE to achieve something like an RSM, but I suppose I'm afraid to take that step and try it out because it's a big investment of time.
- As an alternative, like you suggested, I will play around with the custom design feature to see how its power and correlation matrix compare to the others.

Thank you Mark for your time!

-Carlos

Mark_Bailey · Oct 20, 2019 11:45 AM

I take "confirmed good predicted" to mean that you predicted a new response under a previously untested condition with your model and the result did not contradict the prediction.

The axial points (not "alpha points") are unique to the design method that you used. Central composite designs use axial points to estimate the second-order terms (non-linear effects) for a particular factor. You can always change the factor level to a more practical value after you make the data table. The old CCD method is not that flexible. Custom design does not use axial points for construction and therefore avoids exceeding the original factor range, by the way.
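
As a rough illustration of why axial points can fall outside a hard limit, here is a small Python sketch. The factor range (0 to 10) is made up for the example, and the rotatable axial distance is taken as the fourth root of the number of factorial runs, which is the usual textbook choice. Nothing here is JMP or Design-Expert output.

```python
def axial_levels(low, high, alpha):
    """Return the two axial settings, in real units, for one factor.

    Factorial points sit at coded -1/+1 (low/high); axial points sit
    at coded -alpha/+alpha, measured from the center of the range.
    """
    center = (low + high) / 2.0
    half_range = (high - low) / 2.0
    return center - alpha * half_range, center + alpha * half_range


# Hypothetical factor with a hard lower limit of 0 (values are made up).
low, high = 0.0, 10.0

# Rotatable alpha when the factorial portion has 16 runs (a 2^(5-1) half fraction).
n_factorial = 2 ** (5 - 1)
alpha_ccc = n_factorial ** 0.25

# Circumscribed (CCC): axial points fall outside the factor range.
ccc = axial_levels(low, high, alpha_ccc)

# Face-centered (CCF): alpha = 1, so axial points sit on the limits.
ccf = axial_levels(low, high, 1.0)

# Inscribed (CCI): axial points sit on the limits and the factorial
# points are pulled inward by a factor of 1/alpha.
cci_factorial = axial_levels(low, high, 1.0 / alpha_ccc)

# Run count for 5 factors: factorial + 2 axial per factor + 1 center point.
n_runs = n_factorial + 2 * 5 + 1

print("CCC axial levels:", ccc)
print("CCF axial levels:", ccf)
print("CCI factorial levels:", cci_factorial)
print("Total runs:", n_runs)
```

With these made-up limits the circumscribed design asks for a setting below 0, which is the situation described for factor 4, while the face-centered and inscribed variants stay inside the range at the cost of either rotatability or coverage near the limits.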

Why do you say that there is no aliasing or confounding of terms with RSM? How do you justify this claim? I do not think you understand the meaning and use of the terms 'alias' and 'confounded.' Also, such concepts only make sense in the old design methods of factorial designs. Correlation is a more general and inclusive term that applies to both the old designs and new designs.
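
To make the idea of correlation between model terms concrete, here is a minimal Python sketch using NumPy. It builds a small face-centered central composite design in three coded factors (chosen only to keep the output short, not the five-factor case in this thread), expands it to the main-effect, interaction, and squared columns, and computes the absolute pairwise correlations between those columns. The function names are hypothetical, and this is only an approximation of what a design-evaluation correlation map reports.

```python
import itertools

import numpy as np


def face_centered_ccd(k):
    """Coded face-centered CCD: full 2^k factorial + 2k axial points + 1 center."""
    factorial = np.array(list(itertools.product([-1.0, 1.0], repeat=k)))
    axial = np.vstack([np.eye(k), -np.eye(k)])
    center = np.zeros((1, k))
    return np.vstack([factorial, axial, center])


def quadratic_terms(design):
    """Columns for main effects, two-factor interactions, and squared terms."""
    k = design.shape[1]
    names = [f"x{i + 1}" for i in range(k)]
    cols = [design[:, i] for i in range(k)]
    for i, j in itertools.combinations(range(k), 2):
        names.append(f"x{i + 1}*x{j + 1}")
        cols.append(design[:, i] * design[:, j])
    for i in range(k):
        names.append(f"x{i + 1}^2")
        cols.append(design[:, i] ** 2)
    return names, np.column_stack(cols)


design = face_centered_ccd(3)  # 8 factorial + 6 axial + 1 center = 15 runs
names, X = quadratic_terms(design)

# Absolute correlation between every pair of model-term columns.
corr = np.abs(np.corrcoef(X, rowvar=False))

# Report only the pairs that are not orthogonal.
for a, b in itertools.combinations(range(len(names)), 2):
    if corr[a, b] > 1e-9:
        print(f"{names[a]:>6} vs {names[b]:<6} |r| = {corr[a, b]:.2f}")
```

In this small case the main effects and interactions come out uncorrelated with everything else, but the squared terms are correlated with one another (about 0.4), so even a classical design is not free of correlation among second-order terms.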

The statement that you will use the estimation efficiency and power analysis once you pick a model is nonsense. They are used to evaluate a design before observing the response data. The model is selected after observing the response, at which point the estimation efficiency and power have no meaning.

Be careful with the "window of interest" mindset. I hope that you do not mean that you select factor ranges so that they are limited by where the best setting is expected. That approach usually produces small effects that are difficult to estimate and test.

I have seen another reference to using DSD and space-filling designs together. Can you tell me where you found this idea?

Best of luck.

limac1 · Oct 20, 2019 01:44 PM

Mark,

Thanks for clarifying that terminology.

What I mean by the "window of interest" is the only window in which we can create designs. After setting up RSMs in Design Expert, you can evaluate your design, and for a classic CCD no aliased terms are reported for the 1st- and 2nd-order terms. I think that's where the terminology I've been accustomed to using came from, and maybe where I've been cross-confusing things.

As for the DSD + space filling: I've been reading about the benefits of space-filling designs for computational simulations. However, a pure space-filling design is expensive based on the "10N" rule, where N is the number of factors. The combination of DSD and space filling was suggested by a colleague, which is why I wanted to reach out and get a sense of whether that is a viable path.

I really like the Custom Design by the way, so thank you for that suggestion. It's super straightforward to use in JMP. Design Expert has a custom design feature, but the constraints are difficult to set.

Mark, I really appreciate your responses! Truly excited to understand them better.

-Carlos

Mark_Bailey · Oct 21, 2019 09:27 AM

No problem. I just want to be clear.

Some central composite designs provide runs that eliminate aliases from the model but not all of them. Custom design will minimize or eliminate correlation between estimates. You also have a choice of optimality criterion and design evaluation information to consider before committing to a particular design.

I do not recommend augmenting a DSD with a space-filling design. First of all, one of the major benefits of using a DSD is that augmentation should not be necessary if the screening principles hold. Second, adding the runs of a space-filling design is an ad hoc approach that does not leverage any information about the original design (DSD). Finally, there is a directed and principled approach: Augment Design. This platform is essentially Custom Design but it starts with runs from a previous experiment and complements these runs as you change parameters (factor ranges, model terms, and so on).

limac1 · Oct 21, 2019 10:29 AM

Thank you Mark, I think the minimization of correlation was fundamental. I’m comparing them side by side with the ad hoc approach right now. The correlation in the custom design is significantly less than in the Augmented DSD, which gives me confidence in trying this approach.

Thanks again for your time! Looking forward to learning more through your software and these forums!

statman · Oct 21, 2019 11:19 AM

OP,

I'm a bit confused by your statements and requests. Mark has asked some great questions. I hope you don't mind, but I have some other comments/questions and personal musings.

It is extremely difficult to provide advice without understanding the situation. What response variables are you trying to optimize? What factors are being manipulated (and what are the hypotheses associated with choosing those factors)? This is a simulation you're running, is that correct? How are you simulating noise? Realize the simulation has an algorithm programmed into it. If the factors you are manipulating are not in that algorithm, they will show as insignificant. Who wrote the simulation software? How confident are you that the simulation predicts reality? R-squares do NOT tell the whole story. R-squares will increase every time you add degrees of freedom to the model. I'd be interested in the delta between R-square and R-square adjusted. R-square adjusted should be your default reported value. If this is a simulation, why do you care about reducing runs? Do you lack computing power?
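
To put a number on the gap between R-square and R-square adjusted, here is a minimal Python sketch of the standard adjustment formula. The inputs are hypothetical: 27 runs, an R-square of 0.95, and either a full quadratic model in 5 factors (20 terms plus the intercept) or a reduced model with 8 terms. The term count of the original poster's fitted model is not stated in this thread.

```python
def adjusted_r_squared(r2, n_runs, n_terms):
    """Adjusted R^2 for a model with n_terms predictors plus an intercept."""
    dof_residual = n_runs - n_terms - 1
    if dof_residual <= 0:
        raise ValueError("need more runs than model parameters")
    return 1.0 - (1.0 - r2) * (n_runs - 1) / dof_residual


# Hypothetical case 1: full quadratic model in 5 factors
# (5 linear + 10 two-factor interactions + 5 quadratic = 20 terms).
full = adjusted_r_squared(0.95, n_runs=27, n_terms=20)

# Hypothetical case 2: a reduced model with only 8 terms and the same R^2.
reduced = adjusted_r_squared(0.95, n_runs=27, n_terms=8)

print(f"full quadratic: adjusted R^2 = {full:.3f}")
print(f"reduced model:  adjusted R^2 = {reduced:.3f}")
```

With the same raw R-square, the heavily parameterized model is penalized far more than the reduced one, which is why the delta between the two statistics is worth reporting.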

Thinking of sequential experimentation (Dr. Box's Response Surface Method), you might start with screening designs (large number of x's, predictor variables) to reduce the number of factors of significance (interest). This is based on the principles of Scarcity and Hierarchy of Effects. Then you iterate on those that show interest (e.g., practically significant, interesting patterns in the data, unusual compared to your predictions). Ultimately you might want to optimize and create some predictive polynomial (one that can be useful in the real world, not the simulation). Note: complex, non-linear models are difficult to predict and manage. If you are truly in the optimum space, then perhaps something like Evolutionary Operation (EVOP) might be a better strategy.

Cheers,

Bill

"All models are wrong, some are useful" G.E.P. Box