Solved: Augmenting experiments with a DOE, effect on future experiments

dbu · Jun 8, 2023 2:12 PM

Hello,

I understand that when you enter parameters for an experimental space, a DOE is created to most efficiently explore that space. My question is regarding the effect of previous data on the DOE.

If your previous data has already explored part of the parameter space, will the DOE take this data into consideration to most efficiently explore the rest of the parameter space? Or will the DOE just use the data for fitting, but explore the space as if nothing was known yet?

Thanks for your time

Victor_G · Oct 19, 2022 8:51 AM

Hi @dbu,

Short answer: Yes, the "Augment Design" platform takes the previous experiments and the terms in the model into account when suggesting new runs, so that the new runs add information to what you already have.

Long answer: The results of the "Augment Design" platform depend on what you're trying to achieve and on the modifications you make:

If you reduce the experimental space, the new points will be generated in this smaller space (see screenshot "Reduction of experimental space" done with the space-filling Augment Design platform).
If you increase the experimental space, most of the new runs will be placed in the added region rather than in the original one (see screenshot "Augmentation of experimental space" done with the space-filling Augment Design platform).

Note that if you check "Group new runs into separate block", JMP will still place some runs in the previous experimental space (unless it is not included in the factor ranges at all) in order to check whether this second round of experiments has a variance similar to the first one.

The type of augmentation also changes how new runs are created, depending on whether you're adding more terms to the model (through the option "Augment"), replicating, adding center points or axial points, folding over your design, or doing space filling (for this last point, you can see the screenshot done on simulated data "Space-Filling augmentation", which is quite illustrative, or the previous screenshots for augmentation/reduction of the experimental space). More information here: Additional Examples of Augmentation Choices (jmp.com)
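To make the idea of "taking previous runs into account" concrete, here is a minimal Python sketch. It is not JMP's algorithm: it is a simple greedy D-optimal augmentation written under stated assumptions (two coded factors, a model with main effects and the two-factor interaction, a small candidate grid). The names `model_matrix` and `augment_greedy`, the `ridge` parameter, and the example data are all hypothetical.

```python
import itertools
import numpy as np

def model_matrix(points):
    """Intercept + main effects + two-factor interactions for coded factors."""
    points = np.asarray(points, dtype=float)
    k = points.shape[1]
    cols = [np.ones(len(points))]
    cols += [points[:, j] for j in range(k)]
    cols += [points[:, i] * points[:, j]
             for i, j in itertools.combinations(range(k), 2)]
    return np.column_stack(cols)

def augment_greedy(existing, candidates, n_new, ridge=1e-6):
    """Greedily pick n_new candidates that most increase det(X'X),
    starting from the information already provided by the existing runs."""
    candidates = np.asarray(candidates, dtype=float)
    X = model_matrix(existing)
    F = model_matrix(candidates)
    # Information matrix of the previous runs; the small ridge term
    # (an assumption of this sketch) keeps it invertible when it is singular.
    M = X.T @ X + ridge * np.eye(X.shape[1])
    chosen = []
    for _ in range(n_new):
        Minv = np.linalg.inv(M)
        # det(M + f f') = det(M) * (1 + f' Minv f): maximize f' Minv f
        gain = np.einsum("ij,jk,ik->i", F, Minv, F)
        best = int(np.argmax(gain))
        chosen.append(candidates[best])
        M += np.outer(F[best], F[best])
    return np.array(chosen)

# Hypothetical previous data: x1 was varied, x2 was always at its low level
existing = [[-1, -1], [1, -1], [-1, -1], [1, -1]]
candidates = list(itertools.product([-1, 0, 1], repeat=2))
new_runs = augment_greedy(existing, candidates, n_new=4)
print(new_runs)
```

In this toy case the four selected runs all land at x2 = +1, the part of the space the previous data never visited, which is the behavior described above: the old runs are used, and the new ones go where information is missing.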

I did a study of this kind of augmentation on a concrete use case (from my former company) that you can find here: Réparation d'un plan d'expériences incomplet par augmentation - Réunion du group... - JMP User Commu... Sorry that it's in French, but the point and the advantage of augmentation are easy to see from the use case: it let us make efficient use of previous lab data (supposedly from a DoE, though that wasn't really the case) and augment the initial set of experiments into a correct DoE with fewer new experiments than if we had started from scratch. From this use case, you can look at the screenshot done on 3 factors with the 3D scatterplot to see where the new runs (in green) are in the experimental space compared to the initial runs (in blue).

Hope this answer will help you,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)

dbu · Oct 25, 2022 12:52 PM

Thank you for your help!

statman · Oct 25, 2022 01:34 PM

I have an alternative view. It sounds as though you believe the experimental space is a constant and all you do is change how you sample that space. Perhaps that would be nice, but unfortunately it is totally unrealistic. What percentage of industrial experiments actually repeat? If you ran the same design structure again (over changing noise conditions), would you get the same results? Why or why not? One reason they don't repeat is that we don't start investigations with ALL variables in the study. We start with a subset, often a small subset. It is kind of ironic that we tend to include only those factors that we already hypothesize have an effect. What do we do with the others? Do we ever get data to support those choices? Experimental design must evaluate both sources of variation: those that we choose as a result of our hypotheses (the factors in an experiment) and those that we don't (the noise). Optimization design strategies are only effective if the underlying system is stable and consistent. Perhaps your next iteration should be to determine the stability of your model over changing conditions rather than creating a complex model that only worked yesterday.

"All models are wrong, some are useful" G.E.P. Box

dbu · Oct 26, 2022 12:45 PM

Hi statman (reminds me of the song "Scatman"),

Thanks for your insight on our parameter choices - and I totally agree that we are not taking every variable into consideration. Perhaps I should share that as a preliminary study to our current state, we did perform a screening study in an attempt to use data to choose the most "important" variables. Of course, even our screening test did not include every variable, and I will admit that we pared down a few to make the number of screening test experiments more feasible (as you said, guided by our own hypotheses).

Could you perhaps point me in the direction of how to evaluate model stability? This would be really valuable for us to understand and be able to quantify.

- David

statman · Oct 26, 2022 03:22 PM

David,

I've not heard that song. My barber gave me my nickname when I described the line of work I was in. It just so happens it was about the time the movie Batman was released...

Stability and consistency are best evaluated using sampling and control chart methods. Essentially, set your process up per the model and run it over changing conditions (with time as a surrogate). Evaluate range or moving range charts to assess consistency.
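As a rough illustration of the moving range check, here is a minimal Python sketch for individual measurements taken over time. It uses the standard chart constants for moving ranges of span 2 (D4 = 3.267 and 3/d2 = 2.66); the function name `moving_range_chart` and the readings are hypothetical.

```python
import numpy as np

def moving_range_chart(values):
    """Individuals and moving range (span 2) chart limits."""
    values = np.asarray(values, dtype=float)
    mr = np.abs(np.diff(values))          # moving ranges between consecutive runs
    mr_bar = mr.mean()
    center = values.mean()
    return {
        "mr": mr,
        "mr_bar": mr_bar,
        "mr_ucl": 3.267 * mr_bar,         # D4 for subgroups of size 2
        "x_lcl": center - 2.66 * mr_bar,  # 3 / d2 with d2 = 1.128
        "x_ucl": center + 2.66 * mr_bar,
    }

# Hypothetical response measured with the process set up per the model
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.1, 13.5, 10.0, 9.9]
chart = moving_range_chart(readings)

# Indices of moving ranges above the upper control limit
flagged = np.flatnonzero(chart["mr"] > chart["mr_ucl"]).tolist()
print(chart["mr_bar"], chart["mr_ucl"], flagged)
```

Moving ranges above the upper limit point to a change in short-term variation, which is a signal to look for the noise variable that shifted before trusting the model's predictions.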

There are other statistical procedures you can use, for example holding out data when the model is first being created and then testing the model against the data points that were held out (with boosted trees, bootstrap forest, nearest neighbors and such). Those still rely on the inference space from which the original data was drawn, not on future conditions. You might also check out the Process Screening platform, which uses control chart methods as well.

https://www.jmp.com/support/help/en/17.0/?os=mac&source=application#page/jmp/the-process-screening-r...
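For the holdout idea, a minimal Python sketch could look like the following. It assumes a plain linear model fitted by least squares on simulated data; the function name `holdout_rmse`, the holdout fraction, and the data are hypothetical, and the same limitation applies: the held-out points come from the same inference space as the training points.

```python
import numpy as np

def holdout_rmse(X, y, holdout_fraction=0.25, seed=0):
    """Fit a linear model on part of the data and score it on the held-out rest."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    n_hold = max(1, int(round(holdout_fraction * len(y))))
    hold, train = order[:n_hold], order[n_hold:]

    # Design matrices with an intercept column
    A_train = np.column_stack([np.ones(len(train)), X[train]])
    A_hold = np.column_stack([np.ones(len(hold)), X[hold]])
    coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)

    def rmse(A, target):
        return float(np.sqrt(np.mean((A @ coef - target) ** 2)))

    return rmse(A_train, y[train]), rmse(A_hold, y[hold])

# Simulated data (assumption): two factors, linear response, noise sd 0.5
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(40, 2))
y = 2 + 3 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(0, 0.5, size=40)

train_rmse, hold_rmse = holdout_rmse(X, y)
print(train_rmse, hold_rmse)
```

A holdout error much larger than the training error suggests overfitting, but a good holdout error says nothing about conditions that were never sampled.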

"All models are wrong, some are useful" G.E.P. Box
