JuliaL
Level I

How to manage linked factors in DOE ?

Hello everyone,

 

I'm new to the community, so apologies in advance if I'm posting in the wrong place or if this question has already been answered. My issue concerns DOE factors that are not independent. In my example, it concerns the pressure step. For confidentiality reasons, I have removed some steps of the process and simplified the "recipes".

 

Context :

The DOE concerns the process in a reactor in a cocoa factory.

 

Process : We put cocoa in the reactor (like a pressure canner), run the recipe and then empty the reactor for the next batch.

A recipe has 3 steps : Inject Steam, Apply Pressure in the reactor, Inject Air to dry the product. The order of the steps is important and not changed during this DOE.

JuliaL_1-1734361104201.png

 

Aim : See the impact of the steps and optimize the recipe

Response : pH of the end product

 

We have 3 recipes possible today :

Recipe 1 : No pressure step in the recipe. 0 min at 0 bar

Recipe 2 : Pressure at an intermediate level. 10 min at 1 bar

Recipe 3 : Pressure at a high level. 30 min at 2 bars

If I want to cover the 3 recipes in 1 DOE, I need to have some trials with pressure and some without pressure.

 

Issue :

How can I manage linked factors such as pressure time and pressure level? With a classic DOE, I risk getting impossible runs (like 0 min at 2 bars or 30 min at 0 bar).

 

Parameters         | Pressure 0 min | Pressure 30 min
Pressure at 0 bar  | Possible       | Not possible
Pressure at 2 bars | Not possible   | Possible
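The feasibility table can be sketched in code as a filter on a full factorial candidate set. This is only an illustration in Python, not JMP's Disallowed Combinations feature. The rule used here (time and pressure must be both zero or both non-zero) is an assumption that generalizes the two impossible cases named in the post, and the names `times_min`, `pressures_bar` and `is_feasible` are hypothetical.

```python
from itertools import product

# Levels taken from the three recipes described in the post.
times_min = [0, 10, 30]
pressures_bar = [0, 1, 2]

def is_feasible(time_min, pressure_bar):
    """Assumed rule: time and pressure are both zero or both non-zero."""
    return (time_min == 0) == (pressure_bar == 0)

# Full factorial candidate set, then split into feasible and infeasible runs.
candidates = list(product(times_min, pressures_bar))
feasible = [run for run in candidates if is_feasible(*run)]
infeasible = [run for run in candidates if not is_feasible(*run)]

print("feasible   (time, pressure):", feasible)
print("infeasible (time, pressure):", infeasible)
```

A design tool would then pick runs only from the feasible set, which is the same idea as declaring the infeasible pairs as disallowed combinations.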

 

What I have already done :

To work around this issue, I ran two DOEs, one without the pressure step and one with it. I combined the two DOE tables into one table and then analyzed the data with a linear regression model.

But I have the feeling that this is not the best way because :

  • My data are not well balanced, and I'm afraid of bias due to the unequal number of trials per "recipe"
  • Maybe the Factor Constraints option could be useful. When I tried the Disallowed Combinations Filter, I added pressure as a categorical factor (yes/no), but it didn't work (maybe I missed something)

 

Please find attached my data.

 

Best regards,

 

Julia

2 ACCEPTED SOLUTIONS

Victor_G
Super User

Re: How to manage linked factors in DOE ?

Hi @JuliaL,

 

From what I understand, your process involves multiple ordered steps that are linked one after another (so you can't change factor levels independently/randomly between steps?).

So you seem to be in a Split-Split-Plot design situation, with a "very hard" to change factor ("Time of steam injection" in step 1), "hard to change" factors (those in step 2) and "easy to change" factors (those in step 3, applied after all the ones from steps 1 and 2).

 

About step 2, do you want to test only these 3 recipes, or would you like to test other combinations, so that the pressure and time factors could be varied independently?

If you have no options other than these 3 recipes, maybe you could merge these factors into a single categorical factor "Pressure recipe" with 3 levels :

Recipe 1 : No pressure step in the recipe. 0 min at 0 bar

Recipe 2 : Pressure at an intermediate level. 10 min at 1 bar

Recipe 3 : Pressure at a high level. 30 min at 2 bars
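As a minimal sketch of this merge (in Python rather than JMP, purely for illustration), each level of the categorical factor can be mapped back to its physical settings once the design is generated. The names `PRESSURE_RECIPES`, `expand_recipe` and the two example rows are hypothetical; only the three recipe definitions come from the post.

```python
# Each level of the single categorical factor carries both physical settings.
PRESSURE_RECIPES = {
    "Recipe 1": {"time_min": 0, "pressure_bar": 0},
    "Recipe 2": {"time_min": 10, "pressure_bar": 1},
    "Recipe 3": {"time_min": 30, "pressure_bar": 2},
}

def expand_recipe(design_rows):
    """Add the time and pressure settings implied by each row's recipe level."""
    expanded = []
    for row in design_rows:
        settings = PRESSURE_RECIPES[row["pressure_recipe"]]
        expanded.append({**row, **settings})
    return expanded

# Two made-up design rows, only to show the mapping.
design = [
    {"run": 1, "pressure_recipe": "Recipe 3"},
    {"run": 2, "pressure_recipe": "Recipe 1"},
]
expanded = expand_recipe(design)
for row in expanded:
    print(row)
```

Because time and pressure never appear as separate factors, an impossible pair such as 0 min at 2 bars cannot be generated. The trade-off is that their individual effects cannot be separated.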

 

With these possibilities, I created a Split-Split-Plot design that could match your requirements :

  • Definition of response and factors :

Victor_G_0-1734427793478.png

  • Model specification (Response Surface Model, but you can change it based on your needs) :

Victor_G_1-1734427839380.png

  • Resulting design with 40 runs, 5 whole plots (5 conditions for Step 1), 10 subplots (10 conditions for step 2 factors) :

Victor_G_2-1734427919783.png

This specific split-plot structure lets you run batch experiments for ordered process steps and saves you time. If you can vary the levels of the step 1 factor more often, you can increase the number of whole plots (you'll have more condition changes in the design for the step 1 factor). If you can vary the levels of the step 2 factors more often, you can increase the number of subplots (you'll have more condition changes in the design for the step 2 factors, but the number of subplots should be a multiple of the number of whole plots). Finally, you can also increase the total number of runs depending on your resources.
It's a good idea to test several designs and use the Compare Designs platform to choose the most relevant and practical one.
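The nesting of runs inside subplots inside whole plots can be sketched as follows. This is a Python illustration that assumes perfectly balanced plot sizes; the actual JMP design may group runs differently, and `plot_structure` is a hypothetical helper. Only the counts (40 runs, 5 whole plots, 10 subplots) come from the design above.

```python
N_RUNS = 40
N_WHOLE_PLOTS = 5
N_SUBPLOTS = 10

def plot_structure(n_runs, n_whole, n_sub):
    """Assign each run to a subplot and each subplot to a whole plot (balanced case)."""
    if n_sub % n_whole != 0:
        raise ValueError("number of subplots must be a multiple of the number of whole plots")
    if n_runs % n_sub != 0:
        raise ValueError("this sketch assumes equal-sized subplots")
    runs_per_sub = n_runs // n_sub
    subs_per_whole = n_sub // n_whole
    rows = []
    for run in range(n_runs):
        sub = run // runs_per_sub        # step 2 settings change between subplots
        whole = sub // subs_per_whole    # step 1 setting changes between whole plots
        rows.append({"run": run + 1, "whole_plot": whole + 1, "subplot": sub + 1})
    return rows

rows = plot_structure(N_RUNS, N_WHOLE_PLOTS, N_SUBPLOTS)
for row in rows[:10]:
    print(row)
```

Under this assumption, each whole plot holds 8 runs split over 2 subplots of 4 runs, so the step 1 factor is reset only 5 times over the 40 runs.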

 

Please find attached the corresponding datatable for the design.

 

Hope this answer might help you,

 

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)


statman
Super User

Re: How to manage linked factors in DOE ?

I may not completely understand what you are trying to do, but it looks like you are trying to "pick a winner" rather than understand causal structure? I also don't have enough context ( I tried to open the attached file, but JMP "had issues"...can you just attach the JMP file?).  You indicate the response variable is pH.  Is this one measure per batch?  What about within batch variation? How much of a change in pH is of practical value (significance)?  In the brief look at the file, pH didn't change much (7.47-7.58).  How much of that is measurement error?  Also, I don't see any strategies to handle noise (e.g., lot-to-lot variation in raw materials (cocoa), ambient conditions, cleanliness of the reactor, quality of injected air, etc.).

Regarding factors that you describe as linked, there are a number of approaches that can handle this. Here are some ideas:

1. Nested factors - an example is when the levels for one factor are different for the levels of another factor.  Nested is a hierarchical relationship between the factors.  Since one factor (or more) is nested, there can be no interaction effect.  In your case, you might consider time nested in pressure.

2. Separate DOE's.  I think this is what you did.  Nothing wrong with this, especially since you recognize the bias.  Some effects may be inestimable, but that may not matter depending on your objective.  Don't worry so much about the statistics, do the analysis graphically.

3. Sequential approach. Change levels for pressure (and perhaps others).  Use 0.1 (or very small, but not 0) and 2. Remember, if you are looking for causal structure, the levels in the experiment do not have to be what you end up with.  The first experiment exposes model effects (and is in fact biased toward factor effects, e.g., by setting levels boldly).  Build your model sequentially following hierarchy (1st order, then 2nd order...). Sequential experiments "fine-tune" levels.

4. As Victor suggests, split-plots are typically very useful for sequential processing.  Particularly efficient if  you can "grab" samples and continue to use the rest of the batch for other treatments.

5. Use disallowed combinations in custom design platform.
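To illustrate idea 3, here is a minimal Python sketch of a two-level factorial where the low pressure level is 0.1 bar instead of 0. The 0.1 and 2 bar levels come from the suggestion above; the 1 min low level for time is a hypothetical placeholder, and `decode` and `two_level_factorial` are illustrative names, not JMP functions.

```python
from itertools import product

# Physical (low, high) levels. Only the pressure levels come from the thread;
# the 1 min low level for time is an assumed placeholder.
FACTOR_RANGES = {
    "pressure_bar": (0.1, 2.0),
    "time_min": (1.0, 30.0),
}

def decode(coded, low, high):
    """Map a coded level in [-1, +1] to the physical scale."""
    return low + (coded + 1) / 2 * (high - low)

def two_level_factorial(ranges):
    """Build the 2^k full factorial in physical units."""
    names = list(ranges)
    runs = []
    for coded_levels in product((-1, 1), repeat=len(names)):
        runs.append({name: decode(c, *ranges[name])
                     for name, c in zip(names, coded_levels)})
    return runs

runs = two_level_factorial(FACTOR_RANGES)
for run in runs:
    print(run)
```

Since neither factor is ever exactly zero, all four corners of the design region are runnable and the two factors can be varied independently.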

 

"All models are wrong, some are useful" G.E.P. Box


6 REPLIES
JuliaL
Level I

Re: How to manage linked factors in DOE ?

Hi @Victor_G ,

 

Many thanks for your fast answer and help ! 

 

"From what I understand, it seems your process involves multiple steps ordered and linked one after another (you can't change factors levels independently/randomly between several steps ?)." Yes, I have multiple ordered steps (if I run step 3 before step 2, I will not get the same result at the end). The factors are not linked with each other, except for the factors within one step.

 

"About the step 2, do you want to only test these 3 recipes or would you like to test other combinations, so that the pressure and time factors could be tested independently ?" Unfortunately, in reality I have more than 3 recipes to test, more steps in my process, and different factor settings for each step of each recipe. I also have other steps with the same issue as step 2. Ideally, I would prefer not to treat the pressure settings as categorical data (recipe 1, 2, 3). For example, if I want to reach a pH of 7.25 at the end of my process, I would be interested in seeing solutions with pressure at 2 bars, at 1 bar and without pressure. This way, I could adapt the recipe (if the pressure step is impossible for a short time on my equipment, I can still get the expected result).

 

So the split-plot design is very interesting (I just discovered it!), but I'm not sure that it solves my issue. Should I create the split-plot design with pressure time and pressure level as "hard to change" factors => set a large number of Whole Plots and use User Specified to generate a lot of trials in the plan => delete all the impossible combinations for the pressure step => minimize the number of trials to get a realistic DOE ?

 

Julia

JuliaL
Level I

Re: How to manage linked factors in DOE ?

Hello @statman 

 

Many thanks for your feedback. My answers follow each quoted question :

"I may not completely understand what you are trying to do, but it looks like you are trying to "pick a winner" rather than understand causal structure?" Previous DOEs were run in order to select the parameters that have an impact on the product. Yes, I am now trying to "pick a winner", as I have a target range for pH and target ranges for other responses. The idea is to find optimized recipes that reach the targets. So the better my prediction model is, the better the end product will be. I'm afraid of missing interesting recipes because of the pressure step issue.

"I also don't have enough context ( I tried to open the attached file, but JMP "had issues"...can you just attach the JMP file?)." I have attached the separate DOEs and the two DOEs combined in one table.

"You indicate the response variable is pH. Is this one measure per batch? What about within batch variation?" The pH is measured at the end of the process and can't be measured in the reactor. We also measure other responses. We take several samples within the batch to minimize this variation error. The pH response is the average pH of all the samples.

"How much of a change in pH is of practical value (significance)?" We don't want to be out of specification for pH.

"In the brief look at the file, pH didn't change much (7.47-7.58). How much of that is measurement error?" In the file, pH varies from 7.1 to 7.58. Measurement error is 0.1.

"Also, I don't see any strategies to handle noise (e.g., lot-to-lot variation in raw materials (cocoa), ambient conditions, cleanliness of the reactor, quality of injected air, etc.)." To simplify the problem, I don't put all those factors in the table, because my main issue is the pressure step. But yes you are right. All the factors that could impact the input product and the process itself are registered in my real data table.

I will try your different approaches and come back to this discussion to share the results.

 

statman
Super User

Re: How to manage linked factors in DOE ?

Some follow-up. Since you have been iterating, you should already know statistical significance.  Given the types of factors you are experimenting on (e.g., pressure or no pressure), I'm curious about the previous DOEs.  My suggestion is to develop a contour map of pH (think of it as a topographic map).  Consider where you want to develop the map (the region defined by the factor levels) geometrically and run those treatments.

 

If you are averaging the within batch pH, what is that standard deviation?  

 

Of course, I don't know the specification.  In understanding causal structure, specifications are not useful as they are derived independently.  Specifications, however, do give possible insight into practical significance.  Look at the data for practical significance before statistical.

 

If your measurement error is 0.1 (1 standard deviation), then you have too much measurement variation, given that the range of pH in your experiment is 7.1-7.58. (The measurement error distribution is wider than the entire range of your data.)
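The comparison can be checked with a few lines of arithmetic. This Python sketch assumes the "width" of the measurement error distribution is taken as 6 standard deviations (plus or minus 3 sigma), which is a common convention but is not stated in the thread. The variable names are illustrative.

```python
# Values quoted in the thread.
ph_min, ph_max = 7.10, 7.58
sigma_measurement = 0.10  # taken as 1 standard deviation

# Range of the response observed in the experiment.
observed_range = ph_max - ph_min

# Assumed convention: measurement error spans about +/- 3 sigma.
measurement_spread = 6 * sigma_measurement

ratio = measurement_spread / observed_range

print(f"observed pH range:           {observed_range:.2f}")
print(f"6-sigma measurement spread:  {measurement_spread:.2f}")
print(f"spread / range:              {ratio:.2f}")
```

Under that assumption the measurement spread (0.60) exceeds the observed pH range (0.48), which is the point being made: differences between runs are hard to separate from measurement noise.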

 

"To simplify the problem, I don't put all those factors in the table, because my main issue is the pressure step. But yes you are right. All the factors that could impact the input product and the process itself are registered in my real data table."

 

I can't provide appropriate advice when the situation isn't fully understood.  The question is over what conditions was the experiment conducted.  How were noise variables handled during the experiment?  They were either:

1. Held constant - this is a really bad idea as it restricts the inference space and therefore reduces the likelihood your model will work in the future.

The exact standardization of experimental conditions, which is often thoughtlessly advocated as a panacea, always carries with it the real disadvantage that a highly standardized experiment supplies direct information only in respect to the narrow range of conditions achieved by the standardization.  Standardization, therefore, weakens rather than strengthens our ground for inferring a like result, when, as is invariably the case in practice, these conditions are somewhat varied

R. A. Fisher (1935), Design of Experiments (p.99-100)

2. Allowed to vary randomly during the experiment - better, but still not optimal as this reduces the precision of the experiment

3. Partitioned (e.g., blocking) - Better yet as this can increase the inference space while increasing the precision of the experiment.
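Option 3 can be sketched as a randomized complete block layout: every treatment is run once in each block, in a fresh random order per block. This Python example is only an illustration. The block labels (two cocoa lots) and the helper `randomize_within_blocks` are hypothetical, chosen because lot-to-lot variation in cocoa was named as a noise source earlier in the thread.

```python
import random

def randomize_within_blocks(treatments, blocks, seed=0):
    """Run every treatment once per block, in a random order within each block."""
    rng = random.Random(seed)  # fixed seed so the plan is reproducible
    plan = []
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)
        plan.extend({"block": block, "treatment": t} for t in order)
    return plan

treatments = ["Recipe 1", "Recipe 2", "Recipe 3"]
blocks = ["cocoa lot A", "cocoa lot B"]  # hypothetical noise partition

plan = randomize_within_blocks(treatments, blocks)
for row in plan:
    print(row)
```

Because each lot sees every recipe, the lot effect can be estimated and removed from the error term, instead of being held constant or left to inflate the noise.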

 

 

 

 

"All models are wrong, some are useful" G.E.P. Box
JuliaL
Level I

Re: How to manage linked factors in DOE ?

To close the topic :

After testing several designs, the best option was the sequential approach with factor levels as small as possible (as I have a large number of factors). Separate DOEs are easier to manage when I really need to have the factors at 0 (depending on the equipment). Many thanks for your help, it will help me in the future.

 

To answer the last questions, the within-batch variation is 0.02. The trials are randomized, with replicates at the center points (it is not perfect). I will follow the advice on the contour map.

 

Wishing you a happy Christmas break.