
Splitting DoE table based on subsets of inputs

Hi,

 

I am looking for the best way to design a DoE table for a case like the example below (if that is possible!):

 

My input factors are categorical variables with two levels (e.g. present, not present). Suppose I have about 20 input factors and one numeric response variable. Using a custom design, I can generate a single DoE table with 20+1 columns, in which each row is a combination of those 20 inputs (something like below).

 

X1 | X2 | X3… | X20 | Response
   |    |     |     |
   |    |     |     |

Is there a way to force JMP to generate 3 different tables, each containing only a subset of the inputs, and still be able to calculate the effects of inputs that appear in different tables on the response variable once the experiments are completed? For example, instead of the table above, can I force JMP to create one table with only inputs X1-X6, another with, say, X4-X15, and a third with X9-X20, like below:

 

X1 | X2… | X6 | Response
   |     |    |
   |     |    |

X4 | X5… | X15 | Response
   |     |     |
   |     |     |

X9 | X10… | X20 | Response
   |      |     |
   |      |     |

and still be able to understand, from the model that it fits, what the response would have been if X1 and X20 were the only inputs? I tried to find something similar in the documentation, but without any luck. It might not be possible this way, but any help is appreciated. Thanks!

5 REPLIES
statman
Super User

Re: Splitting DoE table based on subsets of inputs

Quick response: doing that loses the ability to assess interactions across the multiple experiments (that is, across all of the factors). You will only get interactions within each subset of factors. However, 20 factors is a lot. Perhaps you only care about main effects initially, in which case you have several options for different fractional designs. The concern is the design space, which is a function of the factors and their levels. As you change factors, you change the design space. Leaving some factors out and adding others later can create completely different spaces that cannot be compared. It takes a subject matter expert (SME) to determine whether those design spaces are useful.
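To make the point about interactions concrete, here is a minimal Python sketch (not JMP output) that counts which two-factor interactions could even be looked at under the split proposed in the question. It assumes the three subsets are exactly X1-X6, X4-X15 and X9-X20, and that a pair of factors can only be studied together if both are varied in the same table.

```python
from itertools import combinations

# Factor subsets taken from the question: X1-X6, X4-X15, X9-X20.
subsets = [range(1, 7), range(4, 16), range(9, 21)]

# Every possible two-factor interaction among the 20 factors.
all_pairs = set(combinations(range(1, 21), 2))

# Pairs whose two factors are varied together in at least one table.
covered = set()
for subset in subsets:
    covered.update(combinations(subset, 2))

# Pairs that never vary together in any table.
missing = all_pairs - covered

print(len(all_pairs), len(covered), len(missing))
print((1, 20) in missing)  # X1 and X20 never share a table
```

Under these assumptions, a sizeable share of the pairs, including X1 with X20, is never varied together, so those interactions cannot be assessed from the split experiments.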

"All models are wrong, some are useful" G.E.P. Box

Re: Splitting DoE table based on subsets of inputs

What is the reason for splitting twenty factors across three data tables? Why is a single data table with the design for all twenty factors unsatisfactory? What is the basis for allocating a factor to one of the three data tables?

@statman suggests an economical factor screening design to start. You could use a custom design with a default sample size of 28 runs, a definitive screening design with a default size of 50 runs, a Plackett-Burman design with a default sample size of 24 or 28 runs, or a regular fractional factorial design of 32 runs.
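As a rough check on two of these run counts, the sketch below works through the arithmetic in Python. It assumes a main-effects-only model (an intercept plus one term per two-level factor), that Plackett-Burman designs come in multiples of 4 runs, and that regular two-level fractional factorials come in powers of 2. The custom design and definitive screening design sizes quoted above are JMP defaults and are not derived here.

```python
n_factors = 20
n_params = n_factors + 1  # intercept + one main effect per two-level factor


def smallest_multiple_of_4(n):
    """Smallest multiple of 4 that is at least n."""
    return -(-n // 4) * 4


def smallest_power_of_2(n):
    """Smallest power of 2 that is at least n."""
    p = 1
    while p < n:
        p *= 2
    return p


# The design needs at least as many runs as model parameters.
pb_runs = smallest_multiple_of_4(n_params)  # Plackett-Burman size
ff_runs = smallest_power_of_2(n_params)     # regular fractional factorial size

print(n_params, pb_runs, ff_runs)
```

The 28-run Plackett-Burman option is simply the next multiple of 4, which leaves a few more degrees of freedom for estimating error.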

It is difficult for us to give you more help without more background information.

Re: Splitting DoE table based on subsets of inputs

Thanks for your inputs.

 

@Mark_Bailey: The reason for splitting into subsets is purely a practical constraint imposed by a piece of equipment we use, which works faster if the runs in an experiment use ingredients of a similar nature. A single table also works, but it is less efficient because runs with ingredients that are very different from the others require extra effort. So we reasoned that splitting the table so that each part takes only a particular subset of ingredients (chosen largely by people with prior knowledge of the ingredients) might be a more efficient way to use the equipment. Hope that made sense.

statman
Super User

Re: Splitting DoE table based on subsets of inputs

Unfortunately, there is just not enough context to provide appropriate advice.  If not all of "the ingredients" can be used together (something about similar nature?), then why would they be in the same experiment? There are options to handle "difficult to change factors" to make experiments more efficient.

Phil_Kay
Staff

Re: Splitting DoE table based on subsets of inputs

I agree with earlier replies that it is difficult to understand why you would want to do this. Let me know if I am wrong, but it sounds like you want to conduct 3 separate experiments with different subsets of 20 ingredients. There is some limited overlap in ingredients used in the experiments. Then you want to combine the results into a single model with all 20 ingredients as factors.

 

I think you need to decide between executing and modelling 3 separate experiments, or executing and modelling 1 experiment with all 20 ingredients. I don't think you can have it both ways.

 

Creating a grand model from the 3 separate experiments combined is problematic because you will have only explored a few very small "corners" of the total possibility space of the 20 factors. It is effectively a highly constrained experiment. I struggle to see how you could build a useful model.
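A minimal Python sketch of this "corners" argument, under generous assumptions that are not from the original posts: each of the three experiments is a full factorial in its own subset (X1-X6, X4-X15, X9-X20), and every factor outside the subset is held at "not present" (coded -1).

```python
from itertools import product

N = 20
ABSENT, PRESENT = -1, 1
subsets = [range(1, 7), range(4, 16), range(9, 21)]  # subsets from the question


def full_factorial(active):
    """Yield every run in which only the `active` factors vary; the rest stay ABSENT."""
    active = list(active)
    for levels in product((ABSENT, PRESENT), repeat=len(active)):
        run = [ABSENT] * N
        for factor, level in zip(active, levels):
            run[factor - 1] = level
        yield tuple(run)


# Distinct factor combinations visited by the three experiments combined.
runs = set()
for subset in subsets:
    runs.update(full_factorial(subset))

# Share of the full 2^20 space that is ever visited.
fraction = len(runs) / 2 ** N

# X1 and X20 are never present together, so in every run the product
# X1*X20 equals -1 - X1 - X20: the interaction column is an exact linear
# combination of the intercept and the two main-effect columns.
aliased = all(r[0] * r[19] == -1 - r[0] - r[19] for r in runs)

print(len(runs), f"{fraction:.2%}", aliased)
```

Even with full factorials inside each subset, the combined runs cover under 1% of the possible combinations, and the X1*X20 interaction cannot be separated from the X1 and X20 main effects. Any prediction for a run with both X1 and X20 present would therefore rest on assuming that interaction is zero.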

 

I suspect (you will know better than anyone else) that your scientific objectives will be better met by executing the 3 separate experiments.