Discussions

anne_sa · Jun 8, 2023 2:01 PM

Hi everybody,

I am quite a beginner regarding DOE and I have a small question.

Let's say that I have an experiment with several factors. In particular, I want to test a component that has two possible levels, and I would like to test several concentrations (0, 10, 20, or 40). I therefore created my design with a categorical factor (Component: A, B) and a discrete numeric one (Concentration: 0, 10, 20, 40).

In the final design, I have several runs with either A-0 or B-0. However, from a biological point of view these are exactly the same thing: "no component added". Is there a way to provide this information to the software? How can I define that two runs are equivalent? Or should I define the factors differently? Or maybe it is just not feasible.
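To make the redundancy concrete, here is a minimal Python sketch (not a JMP feature, just an illustration) that enumerates the component and concentration combinations from the post and relabels every zero-concentration run as the same physical treatment. The helper name `physical_treatment` and the label "none" are hypothetical.

```python
from itertools import product

components = ["A", "B"]
concentrations = [0, 10, 20, 40]  # levels taken from the post


def physical_treatment(component, concentration):
    """Label a run by what is actually added (hypothetical helper)."""
    if concentration == 0:
        # A-0 and B-0 both mean "no component added"
        return ("none", 0)
    return (component, concentration)


# All factor-level combinations the design can draw from
runs = list(product(components, concentrations))

# Distinct physical treatments once A-0 and B-0 are merged
distinct = sorted({physical_treatment(c, x) for c, x in runs})

print(len(runs))      # 8 combinations on paper
print(len(distinct))  # 7 distinct physical treatments
```

The design treats A-0 and B-0 as two different settings, but only seven of the eight combinations are physically different.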

Thanks in advance for your inputs!

SDF1 · Nov 10, 2020 01:29 PM

Hi @anne_sa ,

From what it sounds like, you have one discrete numeric factor with four levels; call it X1. You also have a two-level categorical factor; call it X2. X1 can take on the values [1, 2, 3, 4]. These are coded values, not the actual values you would use in the experiment. Similarly, X2 takes on the coded values [0, 1]; in the experiment, X2 is actually A or B, so the interpretation is 0=A and 1=B. Putting this into a typical custom DOE, you should get 8 runs where X1 varies from 1 to 4 and X2 is either 0 or 1, and you will get the interactions as well.

On the other hand, if you have two components A and B that can each be present (1) or absent (0), then you'll want an additional categorical variable X3, which also takes the values [0, 1]. In that case X2 stands for component A and X3 for component B, and [0, 1] means off or on, respectively. If you now generate the new DOE with three factors, you'll need at least 16 runs to also get the interactions. Again, X1 runs from 1 to 4, and the two categorical variables for components A and B will be on, off, or a mixture of on and off, e.g., X2=0 (off), X3=1 (on). Adding the additional term allows you to see whether there are any interactions between component A and component B, along with concentration.
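The run counts above can be checked by enumerating the full factorial combinations. This is only a sketch in Python using the coded values from this post; it lists every combination and is not the output of JMP's Custom Design algorithm. The variable names are illustrative.

```python
from itertools import product

x1_levels = [1, 2, 3, 4]  # coded concentration levels
on_off = [0, 1]           # coded levels of a two-level categorical factor

# Two factors: X1 (concentration) and X2 (A or B)
two_factor = list(product(x1_levels, on_off))

# Three factors: X1, X2 (A off/on), X3 (B off/on)
three_factor = list(product(x1_levels, on_off, on_off))

print(len(two_factor))    # 8 runs
print(len(three_factor))  # 16 runs

for run in three_factor[:4]:
    print(run)  # (X1, X2, X3) in coded units
```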

These are not necessarily optimal DOEs, especially for minimizing correlation of the interactions, etc. You might need to have more center runs, run replicates or change the design to what works best for you.

I hope I understood your issue correctly.

Hope this helps!

DS

Mark_Bailey · Nov 10, 2020 01:55 PM

Welcome to DOE!

I am not a fan of continuous factors set to 0. As you discovered, that means you do not have that factor. I think it might be better to have all non-zero levels for concentration. See how that idea sits with your purpose.

P_Bartell · Nov 10, 2020 1:33 PM

To expand upon @Mark_Bailey's comment (and I share his opinion) about not setting continuous factors to 'zero', meaning not in the system at all: too often the system under study acts like a new or atypical system when one key factor is completely absent. The absence of the factor might influence parameter estimation in an untoward way. It might set up a safety issue or a known abject failure within the process that doesn't provide useful information with respect to the problem at hand. Or it might trigger some other unanticipated failure mode.

Lastly, what does your engineering or first principles knowledge (if you have any) tell you about what might happen with the factor completely absent?

And I hate to even suggest this...but if you are bound and determined to find out what happens at 'zero' for that factor, maybe it's time for a simple single home run type trial(s) at the 'zero' and watch what happens? And not include those combinations in the analysis? But again...only if you are certain you're not going to induce a safety or hazardous condition.

statman · Nov 10, 2020 08:29 PM

Echoing my esteemed colleagues, I first want to welcome you to the world of DOE. It is a fascinating and incredibly powerful topic and tool. Unfortunately, advice regarding design selection is dependent on the situation. What hypotheses do you want to get insight into? The factors will represent those hypotheses in your experiment. Realize there is a difference between a test designed to "pick a winner" and an experiment which is designed to give insight into causal relationships. This influences the factor selection, the level setting, and the conditions under which you perform the experiment (the inference space). My advice is to think through the treatment combinations (factors and levels) and try to predict what results you would expect and what results could be possible. Predict what all the possible outcomes could be and what you would do in each instance. Then proceed. For example, if factor X1 is significant, what would you do? If that factor is insignificant, what would you do? (Ask this for all factors.) If you run the experiment and no practically significant variation is "created", what will you do? What if you create a bunch of variation, but nothing is assignable to the factors? And so on.

Focus on what information you want and then on what design fits that. Whenever I see factors set at more than 2 levels, I think about a "pick the winner" objective. You certainly aren't predicting a third order polynomial? So why 4 levels of a continuous variable?

Remember there are always two things about experimentation: one is about understanding the phenomenon, the other is about learning how to experiment.

Sorry for my rant. Carry on. And please come back and ask questions!

"All models are wrong, some are useful" G.E.P. Box

anne_sa · Nov 12, 2020 07:36 AM

Hi,

Many thanks @SDF1, @Mark_Bailey, @P_Bartell and @statman for your comments. Each input is really valuable and helps me to better understand the whole process.

I think I will follow your advice and not include this 0 level in my design.

Regarding the number of levels for the X2 factor, people familiar with the experimental field wanted to have several "intermediate" levels (4 or even more) because they expect a quadratic behavior but do not know exactly at which point the response will start to decrease. However, from what I understand, two levels are enough for a screening step, and three are enough if we want to add a quadratic effect to the model and "pick the winner". So is there any gain in adding intermediate levels? Does it help to catch the exact "rupture" point? Or will it just decrease the precision of the results?
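The link between the number of levels and the polynomial terms a model can support can be sketched in Python. The idea, which is standard regression reasoning rather than anything specific to JMP, is that a polynomial of degree d in one factor needs at least d + 1 distinct levels, otherwise the columns of the model matrix are linearly dependent. The function names below are hypothetical, and the level 25 is an arbitrary midpoint chosen for illustration.

```python
from fractions import Fraction


def model_matrix(levels, degree):
    """Columns 1, x, x^2, ..., x^degree evaluated at each level."""
    return [[Fraction(x) ** p for p in range(degree + 1)] for x in levels]


def rank(matrix):
    """Matrix rank by exact Gaussian elimination on Fractions."""
    rows = [row[:] for row in matrix]
    found = 0
    for col in range(len(rows[0])):
        pivot = next(
            (r for r in range(found, len(rows)) if rows[r][col] != 0), None
        )
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found


def estimable(levels, degree):
    """True if every polynomial term up to `degree` can be estimated."""
    return rank(model_matrix(levels, degree)) == degree + 1


print(estimable([10, 40], 2))         # False: two levels cannot fit a quadratic
print(estimable([10, 25, 40], 2))     # True: three levels can
print(estimable([10, 20, 40], 3))     # False: a cubic needs four levels
print(estimable([0, 10, 20, 40], 3))  # True
```

Under this view, levels beyond three do not unlock the quadratic term; they only matter if higher-order terms are in the model.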

Thanks again for your help!

Mark_Bailey · Nov 12, 2020 08:09 AM

You were correct to add a Discrete Numeric factor when you wanted to dictate the exact factor levels. (Then JMP decides on the terms that must be added to the model.) The best practice for covering the possibility of a non-linear response to the change in a factor level is to add a Continuous factor and specify the terms necessary in the Model. (JMP will decide the best levels to estimate your model parameters.)

Also keep in mind to choose a wide range for this continuous factor. Do not 'second guess' where the best level will be and limit the range in the neighborhood of such a guess. As mentioned before, this study is not about 'testing' to 'pick the winner.' It is about estimating a model to find the winner. The best estimates require a large effect, which is produced by a wide factor range.
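One way to see why a wide range helps, assuming a simple straight-line model with independent errors of standard deviation sigma, is the textbook formula for the standard error of the slope: sigma divided by the square root of the sum of squared deviations of the factor settings from their mean. The Python sketch below applies it to two hypothetical four-run designs; the narrow range 20 to 30 and the value sigma = 1 are assumptions made for illustration.

```python
from math import sqrt


def slope_standard_error(levels, sigma=1.0):
    """Standard error of the fitted slope in simple linear regression."""
    mean = sum(levels) / len(levels)
    sxx = sum((x - mean) ** 2 for x in levels)
    return sigma / sqrt(sxx)


narrow = [20, 20, 30, 30]  # hypothetical range around a guessed optimum
wide = [10, 10, 40, 40]    # range spanning the non-zero levels in the thread

se_narrow = slope_standard_error(narrow)
se_wide = slope_standard_error(wide)

print(se_narrow)
print(se_wide)
print(se_narrow / se_wide)  # how much more precise the wide design is
```

With the same number of runs and the same noise, tripling the range cuts the slope's standard error by a factor of three in this sketch.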

P_Bartell · Nov 12, 2020 08:28 AM

To add to @Mark_Bailey's latest comment: to implement what he's recommending, you'll need to use JMP's signature Custom Design platform. You'll be able to input the specific high and low levels you want to cover in your experimental factor space and the exact model you would like to estimate. But as Mark notes, the Custom Design algorithm will select other levels that MAY not be the nice 'round numbers' that some on your team may naively think are required. As Mark suggests, the goal here is to find a useful model, not the winner. You should also consider using JMP's signature Prediction Profiler and simulation capabilities to help you explore in the quest for that 'useful model'.

Other steps involved in problem solving where DOE is central to the effort include confirmation runs and additional experimentation to enhance the model or your understanding of the system. Confirmation runs are all but required for response optimization problems. Additional experimentation might be recommended to explore the robustness of the solution. Or suppose you miss the inflection point that seems so central?

Problem solving by DOE is quite often not finished with one single design...I recommend you and your team embrace the concept of sequential experimentation as central to your problem solving methodology.

anne_sa · Nov 13, 2020 02:27 AM

Thanks again for all your precious advice! I will try to keep all of that in mind when doing a new design!

matteo_patelmo · Nov 13, 2020 05:21 AM

Realize there is a difference between a test designed to "pick a winner" and an experiment which is designed to give insight into causal relationships.

insightful.
Matteo
