Discussions

FactorAlligator · Jun 8, 2023 2:07 PM

Hi,

We are currently trying to set up our first custom design but are struggling with the choices. Our problem is that we have 4 groups of factors: the 1st group has 2 factors, the 2nd group 3 factors, the 3rd group 4 factors, and the 4th group only one factor. That makes 10 factors in total, and all of them are continuous. This is where we struggle to set it up. If they were only categorical, we would just add categorical factors with the appropriate number of levels.

What we already know from previous experiments:

The concentration of each factor has a quadratic effect on the outcome. This means we need to check not only the interactions but also how the interactions behave at different concentrations.
Only factors of the 3rd group may be present together in the same run. Thus, we would need to set up constraints that disallow combinations of the 2 factors of the 1st group and combinations of the 3 factors of the 2nd group.

I tried a custom design in which I added the 10 factors as continuous factors. Then, under "Define Factor Constraints", I chose "Use Disallowed Combinations Filter" and set: factor 1 AND factor 2 (of group 1) OR factor 1 AND factor 2 AND factor 3 (of group 2). I made sure that the min and max values are included. What I cannot understand is that when I add the 2nd-order interactions, the interaction factor 1*factor 2 is still included, as are other combinations. So I think there is already an error.

I did not know how to get rid of these unwanted interactions, so I removed them manually from the list of 2nd- and 3rd-order interactions. Even so, these combinations are included once I continue to the table.
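The constraint described above ("at most one factor present per group" for groups 1 and 2) can be sketched outside JMP as a filter over candidate runs. This is only an illustration in Python, not JMP's implementation: the factor names, the 0/0.5/1 levels, and the helper names are all hypothetical. It also shows that a filter written literally as "factor 1 AND factor 2 AND factor 3" for group 2 would only exclude runs where all three are present, so pairs would still pass.

```python
from itertools import product

# Hypothetical names: two group-1 factors and three group-2 factors.
GROUPS = {
    "group1": ["g1_a", "g1_b"],
    "group2": ["g2_a", "g2_b", "g2_c"],
}

def is_allowed(run, groups=GROUPS):
    """Allow a run only if at most one factor per group is present (level > 0)."""
    return all(
        sum(run[name] > 0 for name in names) <= 1
        for names in groups.values()
    )

def literal_filter_disallows(run):
    """The filter as literally written in the post (assumption about its reading):
    (g1_a AND g1_b) OR (g2_a AND g2_b AND g2_c) present."""
    both_g1 = run["g1_a"] > 0 and run["g1_b"] > 0
    all_g2 = run["g2_a"] > 0 and run["g2_b"] > 0 and run["g2_c"] > 0
    return both_g1 or all_g2

# Candidate runs on a coarse, hypothetical grid of levels (0 = absent).
names = [n for ns in GROUPS.values() for n in ns]
levels = (0.0, 0.5, 1.0)
candidates = [dict(zip(names, combo)) for combo in product(levels, repeat=len(names))]
allowed = [run for run in candidates if is_allowed(run)]

# A run with two group-2 factors present: not caught by the literal filter,
# but rejected by the "at most one per group" rule.
pair_run = {"g1_a": 0.0, "g1_b": 0.0, "g2_a": 1.0, "g2_b": 1.0, "g2_c": 0.0}
print(len(candidates), len(allowed))
print(literal_filter_disallows(pair_run), is_allowed(pair_run))
```

Note that a filter like this only restricts which runs can appear in the design; whether an interaction term is listed in the model is a separate choice.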

How do I proceed with the design?

Thanks.

statman · Jan 26, 2022 09:56 AM

Welcome to the community. I've read your post and I will have to admit, without understanding the situation better it is difficult to provide specific advice. I have some questions/comments:

1. How do you know there are quadratic effects? Usually when we do screening, we are looking to develop models following the sparsity and hierarchy principles.

2. I'm not sure what a "group" is? Is this experiment on sequential steps of a process? If so, you may be able to take advantage of split-plot designs.

3. Since you suspect non-linear effects, perhaps use a definitive screening design to start with?

4. Are you trying to economize the experiment significantly? If not, perhaps the effects estimated for the nonsensical interactions can be used to estimate the MSE?

Personally, I believe in sequential experimentation. I follow the KISS principle (Keep It Simple and Sequential). Don't try to get too much from any one experiment. Ensure your experiment provides guidance for setting up the next experiment. Screening designs are intended to help start the journey.

"All models are wrong, some are useful" G.E.P. Box

FactorAlligator · Jan 26, 2022 11:06 AM

Thank you for the answer, statman.

It might be worth mentioning that no factor is required to be present; the goal is to figure out which factors to include and in what amount. To answer your questions and specify the problem:

1. We ran some preliminary experiments in which we included only one factor at a time and saw a quadratic response. I assumed that this also holds when screening combinations. Do you think that I should use the best condition for each factor, as determined in the single-factor screen, to get an understanding of the interactions? I would assume that the best condition for a single factor is not necessarily the best condition when you combine it with different factors.

2. A group is a set of factors that share the same properties and, in the case of 2 of the groups, cannot be combined with each other.

3. I am not sure how this would help me on restricting certain combinations.

4. No, actually you are right. That is a good idea.

I think the main problem with setting up the design is that I want to work on two levels at the same time: I want to figure out the most important factors of each group and how they interact, and at the same time figure out how they interact at different concentrations. Second, I do not want every factor to be present in each run.

statman · Jan 26, 2022 11:40 AM

Again, this is difficult because I don't understand the situation well enough. Can the outputs be measured after each group, or is the response only measured at the end of all groups?

I provide the following advice:

1. Don't use one factor experiments to conclude anything. The inference space is entirely too small, interactions with other design factors and noise are impossible to estimate. In fact, what may appear as a quadratic could be an interaction or a noise effect.

2. Start with your hypotheses, your explanations as to how and why each factor would contribute to the effect on the response. Use your hypotheses to develop models.

3. Create multiple experiment designs and evaluate each one for what potential information they will provide (what can be estimated), what will be restricted (inference space) and what will be confounded (perhaps higher order effects). Contrast this with the resources required for each design.

4. Predict ALL possible outcomes from each design and what subsequent action you will take. Predict the rank order of model effects.

5. Then pick one and run it with the expectation this is the beginning of your journey, not the end. There is NO one shot solution.

Realize many optimal designs, as they are called, have very complex confounding and it is challenging to determine how to iterate when things don't make sense.

"All models are wrong, some are useful" G.E.P. Box

Mark_Bailey · Jan 26, 2022 10:52 AM

Why are the factors grouped? What is the meaning or purpose of each group?

Do you mean that only one factor from group 1 and group 2 may be present but more than one factor in group 3 may be present?

Is the concentration range for the factors in each group the same?

FactorAlligator · Jan 26, 2022 11:10 AM

Hi markbailey,

also thank you for your answer.

Why are the factors grouped? What is the meaning or purpose of each group?

The factors are grouped due to similar properties and due to the fact that I can only include 1 factor of each group, except for group 3.

Do you mean that only one factor from group 1 and group 2 may be present but more than one factor in group 3 may be present?

Yes, exactly.

Is the concentration range for the factors in each group the same?

No, some differ by a factor of 100.

Mark_Bailey · Jan 26, 2022 12:00 PM

Thanks for the clarification. Then I think you might define two factors for each of groups 1 and 2. The first factor (e.g., Factor 1 and Factor 2 below) identifies the material and the second factor (e.g., Factor 1 Level and Factor 2 Level below) identifies the concentration. I would use the default coded levels -1 and +1 for concentration because the actual concentration ranges for each material will vary a lot. Use the appropriate low and high concentration for each individual material (e.g., A or B). You are screening, so another experiment with the selected materials can optimize the actual concentration. It might look like this:
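The original illustration is not reproduced here. As a rough stand-in, the idea of one material factor plus one coded level factor can be sketched in Python. The material labels, the concentration ranges, and the `decode` helper are hypothetical and only show how a coded level of -1 or +1 would map to a different actual concentration for each material.

```python
# Hypothetical (low, high) concentration ranges per material, arbitrary units.
# The two ranges differ by a factor of 100, as mentioned in the thread.
RANGES = {
    "A": (1.0, 3.0),
    "B": (100.0, 300.0),
}

def decode(material, coded):
    """Map a coded level in [-1, +1] to the actual concentration of one material."""
    low, high = RANGES[material]
    center = (low + high) / 2
    half_range = (high - low) / 2
    return center + coded * half_range

# Two design columns: the material (categorical) and its coded level (continuous).
design = [(material, coded) for material in RANGES for coded in (-1, 1)]
for material, coded in design:
    print(material, coded, decode(material, coded))
```

Because the same coded value stands for a different actual amount in each material, an effect estimated for the level factor is on the coded scale and has to be translated back per material.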

I am not sure, though, about group 3 (Factor 3). You might have any one or any two (or any three?) of the four factors involved in the same run, is that correct? You might need to enumerate all the combinations here.
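Enumerating the group 3 combinations is a small counting exercise. The sketch below uses Python's `itertools.combinations`; the labels `C1` to `C4` and the `subsets` helper are hypothetical, and the maximum number of factors allowed together is left as a parameter because the thread does not settle it.

```python
from itertools import combinations

GROUP3 = ["C1", "C2", "C3", "C4"]  # hypothetical labels for the four factors

def subsets(items, max_size):
    """All non-empty combinations of items with at most max_size members."""
    return [
        combo
        for size in range(1, max_size + 1)
        for combo in combinations(items, size)
    ]

one_or_two = subsets(GROUP3, 2)   # 4 singles + 6 pairs
up_to_three = subsets(GROUP3, 3)  # adds the 4 triples
print(len(one_or_two), len(up_to_three))
```

Each combination could then serve as one level of a categorical factor. If a run without any group 3 factor is also wanted, an empty combination would have to be added to the list.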

The interactions will cause the number of runs to go up quickly. You will have to think carefully about the potential interactions between groups and include the plausible ones but exclude the ones you can rule out now to keep the number of runs manageable. (There are other ways but you should always use the prior knowledge you can trust or vet first.)
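To get a feel for how quickly the model grows, one can count the terms for k continuous factors. The sketch below is a simple Python count, assuming a model with an intercept, main effects, optional two-factor interactions, and optional quadratic terms; the function name is hypothetical. The number of runs has to be at least the number of terms if all of them are to be estimated.

```python
from math import comb

def n_terms(k, interactions=True, quadratics=True):
    """Count model terms: intercept + k main effects
    (+ all two-factor interactions) (+ k quadratic terms)."""
    total = 1 + k
    if interactions:
        total += comb(k, 2)
    if quadratics:
        total += k
    return total

# Ten continuous factors, as in the original setup of this thread.
print(n_terms(10, interactions=False, quadratics=False))  # main effects only
print(n_terms(10, quadratics=False))                      # plus two-factor interactions
print(n_terms(10))                                        # plus quadratic terms
```

Dropping the interactions that can be ruled out from prior knowledge reduces this count directly.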

(Note that I do not recommend using 0 as a level of a continuous factor, because that essentially turns it into a categorical factor (absent or present) instead of a factor that is always present but at varying levels.)

You will likely use the model only to test effects but not to optimize the factors settings at this stage, so this approach might work well for you.

I know that it is still messy.

FactorAlligator · Jan 31, 2022 06:36 AM

Thank you for the suggestions.

Splitting the properties of each factor in two is a really good idea. We will now try this, as you recommended, with group one. Regarding your warning about setting a factor to 0: for us that is actually a good idea. This way we learn whether the factor needs to be present at all, and by using an RSM design we also get information about different concentrations because the midpoint is included. If some interactions are concentration dependent, we could also estimate them, right? Do you agree that this design could evaluate that?

Mark_Bailey · Jan 31, 2022 10:49 AM

Yes, but you understand that the concentration is really coded, and therefore, specific to each factor involved.

FactorAlligator · Feb 1, 2022 02:01 AM

Actually, I do not understand what you mean by "the concentration is really coded". Could you please elaborate or provide further reading on this? Thank you.


Setting up an screening design
