ELH
Level III

definitive screening design

Dear All,
I am planning to build an RSM to optimize my method. I have 8 continuous factors, 1 categorical factor, and 2 responses. The levels of each factor were chosen based on the team's previous experience. I have no idea whether there are interactions between the factors. I cannot identify the source of noise, so I am planning to randomize. After some research, I think the most suitable design would be a definitive screening design. Is that right?
I would appreciate any remarks or advice.
Thank you

5 REPLIES
statman
Super User

Re: definitive screening design

Said,

 

I assume you are trying to understand a response surface (RSM is a methodology) describing two dependent variables. There are many approaches to doing this. I am biased toward the sequential approach taught to me by Dr. G.E.P. Box. This is a scientific-method approach, where you start with hypotheses (represented by factors in the experiment) and use data to provide insight into those hypotheses. There is no "right" way to do this, and the most efficient and effective way is situation dependent.

This approach typically starts with fractional 2-level designs for screening a large number of variables (based on the principles of sparsity and hierarchy of effects). You then move the design space in the direction of improvement and reduce the number of variables. As your work iterates, you start adding higher-order terms to the model. These higher-order effects may be factorial (interactions) or polynomial (curvature). The sequential work is quite helpful in understanding the surface.

This type of work is best managed by the scientist or engineer who understands the mechanisms (e.g., physical, chemical) and can interpret the results of a statistical study. It is the scientist or engineer who should explain (hypothesize) and predict the possible effects of the factors (and whether each effect is linear in the space being investigated), the interactions and, extremely important, the NOISE. Perhaps you should start by understanding the first-order model and the impact of noise before investigating non-linear relationships, but this depends on what you do and do not know.

My advice, based on your limited discussion of your situation, is to spend some time identifying what the noise might be and then developing a strategy to handle it. I also think blocking (or repeats, split-plots, covariates, etc.) is a better strategy than randomizing. Why? Because you want to understand the impact of the noise, and preferably assign it to a source, not just get a numerical estimate of its size.
While randomizing will, hopefully, give you an unbiased estimate of the noise, it can compromise the precision of the design.
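
To make the precision point concrete, here is a minimal sketch in Python. It assumes a simple two-treatment comparison under the textbook additive model, where the noise splits into a block-to-block part and a within-block part. The function name and all the numbers are hypothetical; this is not the analysis for the poster's 9-factor design.

```python
import math

def se_difference(sigma_within, sigma_block, runs_per_treatment, blocked):
    """Standard error of the difference between two treatment means.

    Assumes additive block effects and independent errors.
    blocked=True:  every block contains both treatments, so the block
                   effect cancels out of the treatment difference.
    blocked=False: runs are fully randomized and each run picks up its own
                   block-to-block noise, which stays in the error term.
    """
    variance_per_run = sigma_within**2
    if not blocked:
        variance_per_run += sigma_block**2
    return math.sqrt(2 * variance_per_run / runs_per_treatment)

# Hypothetical noise levels: block-to-block noise twice the within-block noise
se_blocked = se_difference(1.0, 2.0, runs_per_treatment=8, blocked=True)
se_randomized = se_difference(1.0, 2.0, runs_per_treatment=8, blocked=False)

print(f"blocked:    {se_blocked:.3f}")
print(f"randomized: {se_randomized:.3f}")
```

Under these assumed numbers the randomized comparison is noticeably less precise, because the block-to-block variation is left in the error instead of being removed. The size of the gap depends entirely on how large the block-to-block noise really is.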

"Block what you can, randomize what you cannot", G.E.P. Box

I'm not saying don't use definitive screening designs (Brad Jones has done an excellent job bringing this design strategy to the community). You just need to weigh the information you need at this point in time against the resources required to get it.

Good luck on your journey.

"All models are wrong, some are useful" G.E.P. Box
ELH
Level III

Re: definitive screening design

Thank you @statman for the information you provided.

"My advice, based on your limited discussion of your situation": please let me know which details you need and I will be happy to share them. With a clearer picture you could give me more advice, which would be highly appreciated.
Best regards

 

Re: definitive screening design

I will add just two comments to those of @statman.

 

First, it is a screening design, so it is intended to be used in a screening situation in which the key screening principles hold. One of the most important, and one that I suspect does not hold in your case, is the sparsity of effects. This kind of design emphasizes economy over capability. The assumption is that few of the factors are, in fact, active. Do you expect fewer than half of the 8 factors to be active?

 

Second, screening or not, be careful that the "levels of each factor were determined according to previous experiences of the team" are not too narrow. The ranges should be determined based on the need to provoke a large change in the response so that the regression can efficiently estimate the model parameters (small standard errors). The ranges should always be wide. They should not be selected based on the expectation that the response will always achieve a desirable level. This is an experiment, not a test. The experiment should provide the best data to support the modeling, not to support 'pick the winner.'
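
The link between range width and standard error can be sketched with the usual simple linear regression formula, SE(slope) = sigma / sqrt(sum((x - mean(x))^2)). The example below is a minimal illustration in Python for a single factor run at two levels. The factor settings, the run count, and the noise standard deviation are all made-up numbers.

```python
import math

def slope_standard_error(levels, sigma):
    """Standard error of the slope in a simple linear regression.

    levels: factor settings used in the runs
    sigma:  standard deviation of the random error (assumed known here)
    """
    mean = sum(levels) / len(levels)
    sxx = sum((x - mean) ** 2 for x in levels)
    return sigma / math.sqrt(sxx)

# Hypothetical factor run 4 times at each of two levels
narrow_levels = [45, 55] * 4   # range of 10 units
wide_levels = [30, 70] * 4     # range of 40 units, same number of runs

se_narrow = slope_standard_error(narrow_levels, sigma=2.0)
se_wide = slope_standard_error(wide_levels, sigma=2.0)

print(f"narrow range: SE = {se_narrow:.4f}")
print(f"wide range:   SE = {se_wide:.4f}")
print(f"ratio:        {se_narrow / se_wide:.1f}")
```

With the same number of runs and the same noise, making the range four times wider cuts the standard error of the slope by a factor of four. This holds only as long as the linear model remains adequate over the wider range.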

ELH
Level III

Re: definitive screening design

Thank you @Mark_Bailey

Back to your remarks: I have no idea whether more than half of my factors are inactive! I agree that the DSD is, in the end, a screening design, but it makes it possible to see curvature in my responses and to build a response surface model. From my reading, a definitive screening design can be analyzed with forward stepwise regression, for which effect sparsity should hold. On the other hand, if many effects are likely to be active (more than half of my factors), I should use the Fit Definitive Screening platform, which takes advantage of the structure of the DSD (adding extra runs and analyzing the model in 2 stages).
What do you think?

Re: definitive screening design

I think that you cannot expect to fit an unbiased model based on data from a DSD if the principle of sparsity of effects does not hold. The Fit Definitive Screening method does not escape the assumption of this principle.

 

At the very least, you should use more than the minimum number of runs, and more than the default of 4 additional runs, to cover the potential number of effects in the model. How many additional runs? I don't know. The addition of runs in the DSD is usually based on a consideration of power, a key characteristic in selecting a design for screening.

 

An alternative approach is a custom design where you can change the Estimability of each term as necessary and add 3-4 runs for each If Possible term.

 

The economy of a DSD, like any screening design, is attractive, but the small design will only satisfy the model if it has a small number of terms.
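
A rough count shows the gap between the design and the model. The sketch below, in Python, compares the number of parameters in a full quadratic model for the 8 continuous factors with the minimum DSD run count. The run-count rule (2m + 1 runs for an even number m of continuous factors, 2m + 3 for an odd number) is my recollection of the published conference-matrix construction, so treat it as an assumption. The categorical factor and any extra runs are left out for simplicity.

```python
from math import comb

def full_quadratic_terms(m):
    """Parameters in a full quadratic model with m continuous factors:
    intercept + m main effects + C(m, 2) two-factor interactions + m squares."""
    return 1 + m + comb(m, 2) + m

def dsd_min_runs(m):
    """Assumed minimum DSD run count for m continuous factors
    (2m + 1 for even m, 2m + 3 for odd m); ignores categorical factors."""
    return 2 * m + 1 if m % 2 == 0 else 2 * m + 3

m = 8  # continuous factors in the original question
terms = full_quadratic_terms(m)
runs = dsd_min_runs(m)

print(f"full quadratic model: {terms} parameters")
print(f"minimum DSD:          {runs} runs")
print(f"shortfall:            {terms - runs} parameters beyond the run count")
```

Under these assumptions there are far more candidate terms than runs, so the data can only support a model in which most of those terms are absent. That is the sparsity assumption in numerical form.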