Discussions

ishwarvenugopal · May 9, 2023 04:51 AM

Hi,

I am looking for the best way to design a DoE table for a case like the example below (if that is possible!):

Suppose my input factors are categorical variables with two levels (e.g. present, not present), and I have about 20 input factors and one numeric response variable. Generally, using a custom design I can generate a DoE with one table that has 20+1 columns, where each row is a combination of those 20 inputs (something like below).

X1	X2	X3…	X20	Response

Is there a way that I could force JMP to generate 3 different tables, each having only a subset of the inputs, and still be able to calculate the effects of inputs present in different tables on the response variable once the experiments are completed? For example, instead of the table above, can I force JMP to create one table having only inputs X1-X6, another with, say, X4-X15, and a third with X9-X20, like below:

X1	X2…	X6	Response

X4	X5…	X15	Response

X9	X10…	X20	Response

and still be able to understand what the response would have been if X1 and X20 were the only inputs from the model that it fits? I had been trying to find something similar in the documentation, but without any luck. It might not be possible this way, but any help is appreciated. Thanks!

statman · May 9, 2023 09:55 AM

Quick response....doing that loses the ability to assess interactions across the multiple experiments (all of the factors). You will only get interactions inside each subset of factors. However, 20 factors is a lot. Perhaps you only care about main effects initially, in which case you have several options for different fractional designs. The concern is about the design space, which is a function of the factors and their levels. As you change factors, you change the design space. Leaving some out and adding others later can create completely different spaces that cannot be compared. It requires a subject-matter expert (SME) to determine if those design spaces are useful.

"All models are wrong, some are useful" G.E.P. Box
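To make the point about lost cross-table interactions concrete, here is a minimal Python sketch (not JSL, and not something JMP produces). It assumes that any factor missing from a table is held at "not present" (coded -1) for all runs of that table, and it uses random level settings and a made-up run count purely for illustration. Under those assumptions X1 and X20 are never both "present" in the same run, so the X1*X20 column is an exact combination of the intercept and the two main effects and cannot be estimated separately.

```python
import random

N_FACTORS = 20
# Factor subsets from the original question (1-based factor numbers).
SUBSETS = [range(1, 7), range(4, 16), range(9, 21)]
RUNS_PER_TABLE = 16  # hypothetical run count, chosen only for illustration


def build_combined_design(seed=0):
    """Stack the three tables into one 20-column design coded as -1/+1.

    Assumption: a factor that is not listed in a table is held at
    "not present" (-1) for every run of that table.
    """
    rng = random.Random(seed)
    rows = []
    for subset in SUBSETS:
        for _ in range(RUNS_PER_TABLE):
            row = [-1] * N_FACTORS
            for factor in subset:
                row[factor - 1] = rng.choice([-1, 1])
            rows.append(row)
    return rows


design = build_combined_design()
x1 = [row[0] for row in design]
x20 = [row[19] for row in design]

# Level combinations of (X1, X20) that actually occur in the stacked tables.
observed = set(zip(x1, x20))

# True if, in every run, X1*X20 equals -1 - X1 - X20, i.e. the interaction
# column is a linear combination of the intercept and the main effects.
aliased = all(a * b == -1 - a - b for a, b in zip(x1, x20))

print("observed (X1, X20) combinations:", sorted(observed))
print("X1*X20 fully aliased with intercept and main effects:", aliased)
```

The same check can be repeated for any pair of factors that never share a table; pairs that do share a table are not affected by this particular problem.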

Mark_Bailey · May 9, 2023 01:32 PM

What is the reason for splitting twenty factors across three data tables? Why is a single data table with the design for all twenty factors unsatisfactory? What is the basis for allocating a factor to one of the three data tables?

@statman suggests an economical factor screening design to start. You could use a custom design with a default sample size of 28 runs, a definitive screening design with a default size of 50 runs, a Plackett-Burman design with a default sample size of 24 or 28 runs, or a regular fractional factorial design of 32 runs.

It is difficult for us to give you more help without more background information.
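As a rough illustration of the 32-run regular fractional factorial mentioned above, the following Python sketch builds a 2^(20-15) design by hand. The choice of generators (the first 15 products of two or more base columns) is an arbitrary assumption made for this example and is not necessarily what JMP would select. The result is a resolution III design: main effects are mutually orthogonal but aliased with two-factor interactions.

```python
from itertools import combinations, product

N_BASE = 5        # 2**5 = 32 runs
N_FACTORS = 20    # total two-level factors

# Full factorial in the 5 base factors, coded -1/+1.
base_runs = list(product([-1, 1], repeat=N_BASE))

# Candidate generators: every product of 2 or more base columns (26 in total).
# Assumption: we simply take the first 15; other choices are equally valid.
generators = [
    combo
    for size in range(2, N_BASE + 1)
    for combo in combinations(range(N_BASE), size)
][: N_FACTORS - N_BASE]


def build_design():
    """Return a 32 x 20 design: 5 base columns plus 15 generated columns."""
    design = []
    for run in base_runs:
        extra = []
        for combo in generators:
            level = 1
            for idx in combo:
                level *= run[idx]
            extra.append(level)
        design.append(list(run) + extra)
    return design


def columns_orthogonal(design):
    """Check that every pair of factor columns has a zero dot product."""
    n_cols = len(design[0])
    return all(
        sum(row[i] * row[j] for row in design) == 0
        for i, j in combinations(range(n_cols), 2)
    )


design = build_design()
print(len(design), "runs x", len(design[0]), "factors")
print("main-effect columns orthogonal:", columns_orthogonal(design))
```

Because each added factor is defined as a product of base factors, this design supports a main-effects screening model only; separating interactions from main effects would need follow-up runs.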

ishwarvenugopal · May 10, 2023 1:28 AM

Thanks for your inputs.

@Mark_Bailey: The reason for splitting into subsets is purely due to practical constraints imposed by a piece of equipment we use, which can work faster if the set of runs for the experiment has ingredients of a similar nature. A single table also works, but is less efficient, as it would require extra effort for runs with ingredients that are very different from the others. So logically, we imagine that if we could split the table, taking only a particular subset of ingredients in each part (the choice of which largely depends on people who have some prior knowledge about the ingredients), it might be a more efficient way to utilize the equipment. Hope that made sense.

statman · May 10, 2023 10:23 AM

Unfortunately, there is just not enough context to provide appropriate advice. If not all of "the ingredients" can be used together (something about similar nature?), then why would they be in the same experiment? There are options to handle "difficult to change factors" to make experiments more efficient.

"All models are wrong, some are useful" G.E.P. Box

Phil_Kay · May 12, 2023 9:09 AM

I agree with earlier replies that it is difficult to understand why you would want to do this. Let me know if I am wrong, but it sounds like you want to conduct 3 separate experiments with different subsets of 20 ingredients. There is some limited overlap in ingredients used in the experiments. Then you want to combine the results into a single model with all 20 ingredients as factors.

I think you need to decide between executing and modelling 3 separate experiments, or executing and modelling 1 experiment with all 20 ingredients. I don't think you can have it both ways.

Creating a grand model from the 3 separate experiments combined is problematic because you will have only explored a few very small "corners" of the total possibility space of the 20 factors. It is effectively a highly constrained experiment. I struggle to see how you could build a useful model.

I suspect (you will know better than anyone else) that your scientific objectives will be better met by executing the 3 separate experiments.
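One way to get a feel for how little of the 20-factor space the three tables cover is to count which pairs of factors ever appear in the same table. The short Python sketch below does that for the subsets given in the original question (X1-X6, X4-X15, X9-X20). It only looks at table membership, not at the actual runs, so it is a best-case view of what could be studied jointly.

```python
from itertools import combinations

# Factor subsets from the original question (1-based factor numbers).
SUBSETS = [set(range(1, 7)), set(range(4, 16)), set(range(9, 21))]

# Every unordered pair among the 20 factors.
all_pairs = set(combinations(range(1, 21), 2))

# Pairs of factors that share at least one table.
covered = {
    pair
    for subset in SUBSETS
    for pair in combinations(sorted(subset), 2)
}

# Pairs that never vary together in any table.
missing = all_pairs - covered

print("total factor pairs:      ", len(all_pairs))
print("pairs sharing a table:   ", len(covered))
print("pairs never run together:", len(missing))
```

For these subsets, 67 of the 190 factor pairs (including X1 with X20) never share a table, so nothing about their joint behaviour can be learned from the combined data.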


Splitting DoE table based on subsets of inputs
