frankderuyck
Level VI

Mixture DOE with covariate batch effects

I have to generate a DOE with 7 mixture factors and 6 continuous covariate factors. The covariate factors are measurements on raw material batches, and only 20 batches are available. Mixture effects, quadratic covariate effects, and mixture-covariate interactions need to be estimated using a Kowalski model with only 2nd-order interactions. The number of runs exceeds the number of batches, so repeated sampling from each batch is necessary. The batch effect is uncontrolled and random. How do I set up this DOE? I tried a split-plot design with the covariates set as hard to change, so that the whole plot can be assigned as random, but design generation fails to converge. Do I need to generate a large covariate table with multiple repetitions of the 20 batches? Thanks a lot for any advice.

Victor_G
Super User

Re: Mixture DOE with covariate batch effects

Hi @frankderuyck,

 

I finally had the chance to read the paper and try to reproduce the same design (or at least obtain a similar design using the same methodology), following these steps:

 

  1. Start by creating a table with 40 batches and 6 continuous numerical properties measured on these batches (file "Batch_Covariate_table"). These 6 properties will be used as covariates in the design.
  2. Use the Custom Design platform, enter the mixture factors and ranges, add the covariate factors from the previous table, and set their "Changes" property to "Hard", as mentioned in the paper, to obtain a split-plot design structure.
  3. Change the optimality criterion to "I-Optimal" and specify the number of random starts. I used 10 instead of the 1000 in the publication to save time; be careful if you want to generate it again, as design generation took approximately 1 h on my computer (around 5 min per random start).
  4. Enter the linear constraints as mentioned in the paper :
    {-0.6 * :Potato flakes + 0.4 * :Wheat starch + 0.4 * :Parboiled rice flour + 0.4 *
    :Extruded rice flour + 0.4 * :Corn flour <= 0, -0.3 * :Potato flakes + 0.7 *
    :Wheat starch + -0.3 * :Parboiled rice flour + -0.3 * :Extruded rice flour + -0.3 *
    :Corn flour <= 0, -0.5 * :Potato flakes + -0.5 * :Wheat starch + 0.5 *
    :Parboiled rice flour + -0.5 * :Extruded rice flour + -0.5 * :Corn flour <= 0, -0.5
     * :Potato flakes + -0.5 * :Wheat starch + -0.5 * :Parboiled rice flour + 0.5 *
    :Extruded rice flour + -0.5 * :Corn flour <= 0, -0.5 * :Potato flakes + -0.5 *
    :Wheat starch + -0.5 * :Parboiled rice flour + -0.5 * :Extruded rice flour + 0.5 *
    :Corn flour <= 0}
  5. Specify the model, with mixture main effects, two-factor mixture interactions, quadratic non-mixture effects, and two-factor non-mixture interactions (estimability set to "If Possible" for these last terms).
  6. Set the number of whole plots to 40 (same number as the number of batches/covariates runs available) and number of runs to 256.
  7. Make design !
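
As a side note on step 4, here is a minimal Python sketch (outside JMP) for checking whether a candidate blend satisfies the five linear constraints. The coefficients are copied from the list above; the two blends and the helper name `violated_constraints` are hypothetical and only serve as an illustration. Read algebraically, each constraint bounds one ingredient's share of the five-ingredient total: potato flakes at least 40%, wheat starch at most 30%, and each of the three remaining flours at most 50%.

```python
# Coefficients copied from the constraint list in step 4.
# Each row means: sum(coef * amount) <= 0, in the order given by INGREDIENTS.
INGREDIENTS = [
    "Potato flakes",
    "Wheat starch",
    "Parboiled rice flour",
    "Extruded rice flour",
    "Corn flour",
]
CONSTRAINTS = [
    [-0.6, 0.4, 0.4, 0.4, 0.4],
    [-0.3, 0.7, -0.3, -0.3, -0.3],
    [-0.5, -0.5, 0.5, -0.5, -0.5],
    [-0.5, -0.5, -0.5, 0.5, -0.5],
    [-0.5, -0.5, -0.5, -0.5, 0.5],
]


def violated_constraints(blend, tol=1e-9):
    """Return the 0-based indices of the constraints that the blend violates."""
    amounts = [blend[name] for name in INGREDIENTS]
    return [
        i
        for i, row in enumerate(CONSTRAINTS)
        if sum(c * a for c, a in zip(row, amounts)) > tol
    ]


# Hypothetical blends, for illustration only (not taken from the paper).
ok_blend = {
    "Potato flakes": 0.5,
    "Wheat starch": 0.2,
    "Parboiled rice flour": 0.1,
    "Extruded rice flour": 0.1,
    "Corn flour": 0.1,
}
bad_blend = {
    "Potato flakes": 0.3,
    "Wheat starch": 0.4,
    "Parboiled rice flour": 0.1,
    "Extruded rice flour": 0.1,
    "Corn flour": 0.1,
}

print(violated_constraints(ok_blend))   # no constraint violated
print(violated_constraints(bad_blend))  # first two constraints violated
```

A check like this can be run on the rows of the generated design table (exported from JMP) to confirm that the constraints were entered as intended.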

 

As I wasn't able to access the supplementary materials, I'm not sure this is the same design they obtained. If I have taken into account all their requirements and specifications, it should however be similar to the one they obtained. If you have access to the 256-run table (and the supplementary materials), you can compare the designs.
I would be interested to have it as well, to check that I didn't forget something when setting up the DoE.

From what I understand in the design, the goal with the covariates here is to determine fixed effects of batch properties on the responses (fixed effects terms), as well as the influence of batch variability on the responses variance (whole plot random effect, and consideration of the 40 batches as a representative sample from a larger population).
Interesting design !

 

I hope this complementary answer will help you,

Victor GUILLER
L'Oréal Data & Analytics

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)
frankderuyck
Level VI

Re: Mixture DOE with covariate batch effects

Hi Victor, great job, I wish I could give 10 Kudos! I have no access to the original DOE, but I'm convinced it will be similar.

Indeed, the goal is to find interaction effects between raw material properties and mixture effects, so that robust settings might be found or, failing that, raw materials may have to be more tightly controlled.

One remark on this paper: 256 runs is a lot. I wonder whether it is possible with fewer runs, and what minimum number of runs is required to still achieve a useful and reliable model.

Thanks a lot for your great help!!

Frank

Victor_G
Super User

Re: Mixture DOE with covariate batch effects

Hi Frank,

I'm glad this design could work for you. I would still recommend some caution about the design generated, as in the absence of the original design from the paper, I'm not able to compare it or see if I missed something during its generation.

Regarding your concern about the high number of runs of this design, I see two options (that you can combine):

  • Reduce the number of batches in the design (to reduce the number of whole plots): as described previously, you could start from your batch table by creating a small D-Optimal custom design on the covariates (with only main effects in the model), in order to keep only the 7-10 most dissimilar/representative batches. This could heavily reduce the required number of runs, but the two-factor non-mixture interaction coefficients and the quadratic non-mixture coefficients will likely be estimated less precisely.
  • Simplify the model and/or set the estimability of some higher-order model terms to "If Possible": this could save some runs as well, but at the cost of a simpler model that may not accurately describe the system.
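
To make the first option more concrete, here is a rough Python sketch of selecting a small set of dissimilar batches for a main-effects model. It is a simple greedy log-determinant heuristic, not the algorithm JMP uses in Custom Design, and the batch table is randomly generated: the function name `greedy_d_optimal_subset`, the `ridge` parameter, and the data are all assumptions for illustration. In practice the covariates would normally be standardized first.

```python
import random


def greedy_d_optimal_subset(rows, k, ridge=1e-6):
    """Greedily pick k rows that increase det(X'X) the most (main effects only)."""
    # Model row = intercept + one column per covariate.
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    # Inverse of the ridge-regularised information matrix; starts at I / ridge.
    m_inv = [[(1.0 / ridge if i == j else 0.0) for j in range(p)] for i in range(p)]
    chosen = []
    for _ in range(k):
        best, best_gain, best_u = None, -1.0, None
        for idx, x in enumerate(X):
            if idx in chosen:
                continue
            u = [sum(m_inv[i][j] * x[j] for j in range(p)) for i in range(p)]
            # Adding row x multiplies the determinant by (1 + gain).
            gain = sum(x[i] * u[i] for i in range(p))
            if gain > best_gain:
                best, best_gain, best_u = idx, gain, u
        chosen.append(best)
        # Sherman-Morrison update of the inverse after adding the chosen row.
        denom = 1.0 + best_gain
        m_inv = [
            [m_inv[i][j] - best_u[i] * best_u[j] / denom for j in range(p)]
            for i in range(p)
        ]
    return chosen


# Hypothetical batch table: 20 batches x 6 standardized properties.
rng = random.Random(0)
batches = [[rng.gauss(0.0, 1.0) for _ in range(6)] for _ in range(20)]

selected = greedy_d_optimal_subset(batches, k=8)
print(sorted(selected))  # indices of the batches to keep
```

A greedy pass like this is only a starting point; an exchange algorithm, or the Custom Design platform itself with the batch table loaded as covariates, will generally find a better subset.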

 

You can still calculate the minimum number of runs required from the number of factors (and the number of terms in the assumed model) in your project, to judge whether you should simplify your problem (remove some factors, do a screening first, etc.) or whether the required number of runs is acceptable.
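
As a rough illustration of that calculation, here is a small Python sketch that counts the fixed-effect terms for the situation in the first post (7 mixture factors, 6 covariates). It assumes a Kowalski-style model in which covariate main effects are carried by the mixture-by-covariate terms and there is no separate intercept; the function name and its flags are hypothetical. The count is only a lower bound on the number of runs: a split-plot design also needs runs to estimate the whole-plot and residual variances.

```python
from math import comb


def count_model_terms(
    n_mixture,
    n_covariate,
    mixture_2fi=True,
    covariate_quadratic=True,
    covariate_2fi=True,
    mixture_by_covariate=True,
):
    """Count fixed-effect terms of a Scheffe-type mixture model with covariates."""
    terms = n_mixture  # linear blending terms (no intercept in a mixture model)
    if mixture_2fi:
        terms += comb(n_mixture, 2)  # x_i * x_j
    if covariate_quadratic:
        terms += n_covariate  # z_k ** 2
    if covariate_2fi:
        terms += comb(n_covariate, 2)  # z_k * z_l
    if mixture_by_covariate:
        terms += n_mixture * n_covariate  # x_i * z_k
    return terms


full_model = count_model_terms(7, 6)
no_cross_terms = count_model_terms(7, 6, mixture_by_covariate=False)
print(full_model)      # 7 + 21 + 6 + 15 + 42 = 91
print(no_cross_terms)  # 7 + 21 + 6 + 15 = 49
```

Switching the flags off one at a time shows how much each family of terms costs, which helps decide which terms to drop or set to "If Possible".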

 

Hope this complementary answer may help you,

Best,

Victor GUILLER
L'Oréal Data & Analytics

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)
frankderuyck
Level VI

Re: Mixture DOE with covariate batch effects

Hi Victor, 

I agree, it is better to use a sequential approach with a limited number of batches, chosen with a covariate selection DOE, and "If Possible" screening (a powerful option) of higher-order mixture effects. One can reserve a number of batches as a test set for validation of the model, which can be augmented to a small space-filling or special Scheffé mixture DOE.

Kind regards! Frank

frankderuyck
Level VI

Re: Mixture DOE with covariate batch effects

I still have one question: on page 4 of the potato crisp paper, Peter assigns a random block effect to the 40 batches?

However, the batches are correlated with the fixed, hard-to-change covariate effects, so I assume that in a mixed model analysis with batch assigned as random, REML will not estimate the covariate effects?

Victor_G
Super User

Re: Mixture DOE with covariate batch effects

Hi @frankderuyck,

 

Yes, a random block effect is assigned due to the "hard-to-change" covariate factors which create a split-plot design structure type.

However, a mixed model analysis does make it possible to estimate quadratic covariate effects, as well as two-factor covariate interactions. You can try it using the data table provided and creating a random response.
The "hard-to-change" covariate factors create a split-plot structure, which implies a restriction on randomization, but you are still able to estimate the assumed covariate terms, at a lower precision than "easy-to-change" factors would have allowed.

 

Hope this answer will clarify,

Victor GUILLER
L'Oréal Data & Analytics

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)
statman
Super User

Re: Mixture DOE with covariate batch effects

This makes no sense to me, so perhaps you can explain your terminology?

"each batch being characterized by several covariate factors".  Why are you using the word covariate here?  I typically think of covariates as independent factors that you cannot directly control (i.e., noise), but can be measured.  Covariates are one of the many ways to handle noise in a designed experiment.  Because they are assigned in the model as a random effect, they are accounted for and their influence on statistical tests is minimized (increasing the precision of the design).

 

I assume you have no ability to control the process making batches.  Is this a supplied material?  Are these factors input variables with respect to the batch "quality"?  How do you know the relationship between the factors and the measures of the batch "quality"? Can you quantify the batch quality in a meaningful way (without using all 6 input factors)?  A single measure of the batch quality may be useful as a covariate in an experiment.

"All models are wrong, some are useful" G.E.P. Box
statman
Super User

Re: Mixture DOE with covariate batch effects

Here are my thoughts: Admittedly, I do not understand your situation sufficiently to provide specific advice.

I believe a sequential approach would be more efficient than one BIG experiment.  

I'm not sure I understand the use of the term covariate for your example, nor what you would actually do to your process if you have covariate by mixture factor interactions (other than you can say your process is not robust).  It is a challenge to account for 1 covariate in a typical experiment.  Accounting for 6 is likely not very efficient or effective. Repeated sampling from the same batch does not increase the degrees of freedom associated with an experiment whose experimental unit is defined as a batch.
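
The point about repeated sampling can be illustrated with a small Python sketch. It builds the pure covariate part of a quadratic model (intercept, linear, two-factor interaction, and squared terms of 6 covariates, 28 columns in total) for 20 batches, then repeats every batch five times and compares the rank of the two model matrices. The covariate values are randomly generated integers and the function names are hypothetical; exact rational arithmetic is used so that the rank is not affected by rounding.

```python
from fractions import Fraction
from itertools import combinations
import random


def quadratic_covariate_terms(z):
    """Intercept, linear, two-factor interaction and squared terms for one batch."""
    row = [Fraction(1)] + list(z)
    row += [a * b for a, b in combinations(z, 2)]
    row += [a * a for a in z]
    return row


def exact_rank(matrix):
    """Rank by Gaussian elimination in exact rational arithmetic."""
    rows = [list(r) for r in matrix]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no usable pivot in this column
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(rank + 1, len(rows)):
            if rows[i][col] != 0:
                factor = rows[i][col] / rows[rank][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rank])]
        rank += 1
        if rank == len(rows):
            break
    return rank


# Hypothetical batch table: 20 batches x 6 covariates.
rng = random.Random(1)
n_batches, n_covariates = 20, 6
batch_rows = [
    quadratic_covariate_terms(
        [Fraction(rng.randint(-10, 10)) for _ in range(n_covariates)]
    )
    for _ in range(n_batches)
]

n_terms = len(batch_rows[0])                 # 28 covariate-only terms
rank_single = exact_rank(batch_rows)         # one run per batch
rank_repeated = exact_rank(batch_rows * 5)   # five runs per batch
print(n_terms, rank_single, rank_repeated)
```

The rank can never exceed the number of distinct batches, however often each batch is sampled, so terms that depend only on the covariates are limited by the number of batches and not by the number of runs.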

 

I would start with directed sampling to understand the batch variation components.  It seems you have at least 3 components of variation: Measurement, Within batch and Batch to batch.  You should want to understand the relative size and the consistency of these components before planning your experiment. Apparently you have 6 measures (response variables or Y's) of the batch that provide insight to the batch variation? Do these correlate?  Do you want to understand why there is within and between batches variation?  How confident are you in measurement systems?

 

Mixture designs are optimization designs.  They are not really intended for screening.  I think you are in need of screening before optimization.

"All models are wrong, some are useful" G.E.P. Box
frankderuyck
Level VI

Re: Mixture DOE with covariate batch effects

Studying the raw material variance is difficult, as batches are supplied by external vendors. I agree that, as discussed above, a pre-selection using a covariate DOE is recommended, as it will yield the best selection.