(Screening) DOE choice when response is non-linear

I have a somewhat conceptual question about DOE choice (or another experimental framework) when the effect of the factors is non-linear. My experiment has more than 10 factors, which I reduced to 6, and I ran a definitive screening design. Per the results (see image below), there is no clear linear (or even quadratic) relationship between the factors and the outputs. A few look like they might be, but on the whole they do not (at least per my analysis).

 

[Image: DSD results plots (Screenshot 2024-06-14 at 16.31.50.png)]

 

Based on this, my question is: if the underlying response is non-linear, does it even make sense to use a fractional factorial or definitive screening design (DSD) for screening, given that these designs can only fit up to quadratic terms (if I am not wrong)? If the goal is to screen, what other alternatives in the DOE toolbox would you recommend I consider?

 

One solution I thought of was to decrease the range of each factor, the idea being that if you make the range small enough, any underlying response will be approximately linear. However, I might lose the point of the screening experiment, and I might have to conduct many screening experiments at different ranges. Is this what is done in practice with non-linear responses? Consider the following graph from a paper I read. If the underlying response looks like this, I am not sure which DOE methods (initial screening + response surface?) can get me to something like it. What would you recommend for finding the global optimum in these cases?

 

[Image: response graph from the paper (Screenshot 2024-06-14 at 16.36.47.png)]
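To make the range-shrinking idea concrete, here is a minimal sketch (Python, with a made-up response function rather than my actual data) showing that a straight-line fit that fails over a wide factor range can fit almost perfectly over a narrow one:

```python
# Minimal sketch: a nonlinear response looks locally linear over a small
# enough factor range. The response function is hypothetical.
import numpy as np

def response(x):
    # Made-up nonlinear response: an oscillation on top of a linear trend
    return np.sin(3 * x) + 0.5 * x

for lo, hi in [(-2.0, 2.0), (-0.1, 0.1)]:   # wide range vs. narrow range
    x = np.linspace(lo, hi, 50)
    y = response(x)
    coeffs = np.polyfit(x, y, deg=1)         # least-squares straight line
    resid = y - np.polyval(coeffs, x)
    r2 = 1 - resid.var() / y.var()
    print(f"range [{lo}, {hi}]: linear-fit R^2 = {r2:.3f}")
```

Of course, the narrow range only tells me about local behavior, which is exactly my worry about losing the point of the screening experiment.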

statman
Super User

Re: (Screening) DOE choice when response is non-linear

Regarding your analysis of the DSD you performed: I can't conclude anything from those plots alone. There are a number of analysis outputs that need to be evaluated that are not provided, for example, residuals analysis. It looks like you may have some unusual data points in your experiment. Also, how did you go from more than 10 factors to 6? Where did you set the factors you excluded? What is your inference space? What noise changed during the experiment, and what noise was held constant? Was your measurement system evaluated before or during the experiment?

 

There is no way to provide advice that works in all situations. What is required is that you perform situation diagnostics based on the generally accepted criteria for design selection. Part of this set of criteria is predicting the rank order of model effects up to 2nd order. Based on your predictions, choose the appropriate resolution (for a 2nd-order factorial) and polynomial (quadratic).
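To illustrate what choosing a resolution buys you, here is a minimal sketch (plain numpy, coded -1/+1 units; the four factors are hypothetical) of a 2^(4-1) resolution IV fraction built from the generator D = ABC:

```python
# Minimal sketch: build a 2^(4-1) resolution IV fractional factorial by
# running a full factorial in A, B, C and generating D = ABC.
import itertools
import numpy as np

base = np.array(list(itertools.product([-1, 1], repeat=3)))  # full 2^3 in A, B, C
d = base[:, 0] * base[:, 1] * base[:, 2]                     # generator D = ABC
design = np.column_stack([base, d])

print(design)   # 8 runs x 4 factors, coded -1/+1
# With I = ABCD (a word of length 4, hence resolution IV), main effects are
# aliased only with three-factor interactions; two-factor interactions are
# aliased with each other, not with main effects.
```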

 

Advice, in general, is to build models hierarchically. Following the Taylor series, start with 1st order and add order through iteration. Much depends on whether the optimum is truly inside the design space or outside it. If the optimum is outside the design space, it is preferable to explore the linear relationships first so as to find the factors that can move you through the response surface space fastest (the fastest path between two points is a straight line). Once you are near the optimum, augment the design space. Complex relationships may exist throughout the space, but they are not useful until you get near the optimum. The more complex the model, the less universally useful it is.
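Here is a minimal sketch of that straight-line step (the data and coefficients are hypothetical): fit a first-order model to a small coded design and read the steepest-ascent direction directly off the linear coefficients:

```python
# Minimal sketch: steepest ascent from a first-order fit on hypothetical data.
import itertools
import numpy as np

rng = np.random.default_rng(1)
# Coded 2^2 design plus a center point (made-up runs)
X = np.array(list(itertools.product([-1.0, 1.0], repeat=2)) + [(0.0, 0.0)])
y = 3.0 + 1.5 * X[:, 0] - 0.8 * X[:, 1] + rng.normal(0, 0.1, len(X))

# First-order fit: y ~ b0 + b1*x1 + b2*x2
A = np.column_stack([np.ones(len(X)), X])
b = np.linalg.lstsq(A, y, rcond=None)[0]

# The steepest-ascent path is proportional to the linear coefficients
direction = b[1:] / np.linalg.norm(b[1:])
print("coefficients:", b.round(2))
print("steepest-ascent path: x_new = x_center + step *", direction.round(2))
```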

 

My thoughts on global optimum vs. local maxima: you want your global optimum to be plateau-ish (that is, robust). Local optima tend to be peaked/pointed and not robust. I suggest you read Box's discussions of sequential experimentation to get an understanding of his approach to model building and complex relationships.
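A minimal numeric sketch of that robustness point (made-up response shapes, not any real process): push the same factor noise through a plateau-shaped optimum and a sharply peaked one and compare the spread in the response:

```python
# Minimal sketch: a plateau-like optimum transmits far less factor noise
# to the response than a sharply peaked one. Both shapes are made up.
import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(0.0, 0.1, 10_000)    # factor wanders around its set point

plateau = 10 - x**4                  # flat-topped optimum
peak = 10 - 50 * x**2                # sharp, pointed optimum

print(f"plateau response sd: {plateau.std():.4f}")
print(f"peaked  response sd: {peak.std():.4f}")
```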

 

"A good model is an approximation, preferably easy to use, that captures the essential features of the studied phenomenon and produces procedures that are robust to likely deviations from assumptions."

(G.E.P. Box)

 

BTW, Bill Diamond (Practical Experiment Designs) also suggests reducing the spacing between levels to counter possible non-linear relationships. However, I don't think this applies to screening designs. IMHO, the biggest concern when just beginning an investigation is the beta (Type II) error: dropping potentially useful factors because there is no evidence to suggest the factor has an effect. Setting factor levels bold, but reasonable, is good advice to minimize this error. In addition, since you hope to mitigate bias in selecting factors for further study, set all factors in the investigation bold.
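A minimal sketch of the signal-to-noise argument for bold levels (the slope, noise, and run counts are assumed for illustration): the estimated effect grows with the level spacing while the noise does not, so bolder levels give more power against the beta error:

```python
# Minimal sketch: wider level spacing raises the effect-to-noise ratio.
# Slope, noise sd, and run count are all assumed numbers.
import numpy as np

slope = 0.5     # true effect per unit of the factor (assumed)
sigma = 1.0     # run-to-run noise sd (assumed)
n = 8           # runs per level (assumed)

for half_range in (0.5, 2.0):           # timid vs. bold level spacing
    effect = slope * 2 * half_range      # high-minus-low response difference
    se = sigma * np.sqrt(2 / n)          # se of a difference of two means
    print(f"half-range {half_range}: signal/noise = {effect / se:.1f}")
```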

"All models are wrong, some are useful" G.E.P. Box

Re: (Screening) DOE choice when response is non-linear

You might investigate the possibility of a highly non-linear response by using a different model type with the data in hand. A neural network model does well with high dimensionality and non-linearity. A Gaussian process model is also a good choice, though it tends to perform better with a space-filling design.
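For example, here is a minimal sketch of the Gaussian process idea (scikit-learn rather than JMP, with synthetic data standing in for the DSD runs); short fitted length-scales are a rough flag for factors with strongly non-linear effects:

```python
# Minimal sketch: fit a Gaussian process with an anisotropic RBF kernel to
# look for nonlinear structure. The data here are synthetic stand-ins.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, WhiteKernel

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(13, 6))     # 13 runs x 6 coded factors, like a DSD
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2 + rng.normal(0, 0.05, 13)

kernel = ConstantKernel() * RBF(length_scale=np.ones(6)) + WhiteKernel()
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

print("log marginal likelihood:", round(gp.log_marginal_likelihood_value_, 2))
# Factors with short fitted length-scales drive rapid (nonlinear) changes
print("fitted kernel:", gp.kernel_)
```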