jmpkat
Level I

Screening designs - How do we know if the effects are not significant or if the whole data is not good (significant variation) without doing replicates?

To show robustness of a chemical process, we often need to show regulators that selected parameters (temperature, equivalents, time, and reagent ratios) do not impact the quality (purity) of the final product.  To demonstrate this, regulators prefer that screening designs be performed.  Usually, we want to show that the parameters have no detrimental effect within the ranges studied, so that the process is robust.

 

My question is this: if the experimenter performs the runs with utmost precision and the data show that none of the factors (or interaction terms) influence quality, then that's fine.  But if the experimenter collected poor data and the model shows that none of the factors are influential, how do we figure out whether we can trust the data and the model?  Is there any way to tell, especially if we don't have replicate runs to estimate the variation?

 

I see that Definitive Screening Designs do not necessarily include replicate experiments, so how can we know whether there is significant variability in the responses?

P_Bartell
Level VIII

Re: Screening designs - How do we know if the effects are not significant or if the whole data is not good (significant variation) without doing replicates?

 I'll take a stab at some thoughts.

 

First: if the variation in the responses is swamping any signal from the factor variation, then in my mind there is one of three causes:

 

1. Excessive measurement system variation. In this scenario, the response variation is dominated by measurement system variation. Do you have an ongoing process control system in place for the measurement system? Do you know the inherent 'normal' variation of the measurement system? If not, you'll never know whether your conclusion of 'no effect' due to factor variation is real or just an illusion caused by excessive measurement system variation. (A small sketch of this comparison follows the list below.)

 

2. Noise/nuisance variables influencing the experimental process in a way that increases response variation. It's important to be ever vigilant for these types of influences. This is a tricky one because you can never know for sure...but a watchful eye, perhaps by an independent third party, during experimental planning and conduct can be helpful.

 

3. There really is nothing going on...
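To make cause #1 concrete: below is a minimal sketch (plain Python/NumPy rather than JMP, with made-up numbers) of comparing the measurement system's repeatability against the spread of the DOE responses. The reference samples and purity values are hypothetical.

```python
import numpy as np

# Hypothetical repeated purity measurements of two reference samples,
# collected outside the DOE just to characterize the measurement system.
repeats = {
    "ref_A": np.array([98.1, 98.3, 98.2, 98.0, 98.2]),
    "ref_B": np.array([97.6, 97.5, 97.8, 97.6, 97.7]),
}

# Pooled within-sample standard deviation ~ measurement repeatability.
ss = sum(((v - v.mean()) ** 2).sum() for v in repeats.values())
df = sum(len(v) - 1 for v in repeats.values())
sigma_meas = np.sqrt(ss / df)

# Spread of the responses observed across the (unreplicated) DOE runs.
doe_purity = np.array([97.9, 98.4, 97.7, 98.1, 98.3, 97.8, 98.2, 98.0])
sigma_doe = doe_purity.std(ddof=1)

print(f"measurement repeatability SD: {sigma_meas:.3f}")
print(f"DOE response SD:              {sigma_doe:.3f}")
# If the two are comparable, a finding of 'no significant effects' may simply
# mean the measurement system is swamping any factor signal.
```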

 

Overall, your question is at the heart of hypothesis testing: we can never 'accept' a null hypothesis that 'nothing is happening with respect to the factors'...we can only fail to reject it. In the case you've described, all you can really say is that we fail to reject the null hypothesis that each estimable effect is equal to zero.
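One way to put a number on that 'fail to reject' is a power calculation: given the residual noise and the number of runs, how likely was the design to detect an effect you would actually care about? The sketch below (Python/SciPy; the effect size, noise SD, run count, and error degrees of freedom are placeholders, not values from this thread) shows the usual arithmetic for a two-level factor effect in an orthogonal design.

```python
import numpy as np
from scipy import stats

def effect_power(delta, sigma, n_runs, df_error, alpha=0.05):
    """Approximate power to detect a two-level factor effect of size `delta`
    (difference between high- and low-level means) in an orthogonal design
    with `n_runs` runs, residual SD `sigma`, and `df_error` error df."""
    se_effect = 2.0 * sigma / np.sqrt(n_runs)      # std. error of the effect
    ncp = delta / se_effect                        # noncentrality parameter
    t_crit = stats.t.ppf(1 - alpha / 2, df_error)  # two-sided critical value
    # P(|t| > t_crit) under the noncentral t distribution
    return (1 - stats.nct.cdf(t_crit, df_error, ncp)
            + stats.nct.cdf(-t_crit, df_error, ncp))

# Example: a 1%-purity effect, 0.5% residual SD, 12 runs, 5 error df.
print(f"power: {effect_power(delta=1.0, sigma=0.5, n_runs=12, df_error=5):.2f}")
# Low power means 'no significant effects' says very little about the process.
```

If the power to see a practically important effect is low, 'nothing significant' is an uninformative result rather than evidence of robustness.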

jmpkat
Level I

Re: Screening designs - How do we know if the effects are not significant or if the whole data is not good (significant variation) without doing replicates?

Thank you for your reply.  Let me clarify some things.

 

Let us say factors A, B, and C have the potential to impact the purity of a product in a chemical reaction, but only when tested at extremes.  There will be numerous other factors (let us call them X's) that have a very marginal effect on purity.  These other factors are not included in the study because either they have only a negligible impact on purity, or we have high confidence that they can be controlled precisely at their set points as the process goes to the plant.

 

Now the goal of the study is to do experiments in the lab (on small scales mimicking large scale operation as much as possible), systematically varying factors A, B and C while keeping all other noise factors constant.  

 

In a hypothetical situation where the chemist running these reactions did not perform the experiments meticulously, meaning factors A, B, and C were not maintained at the points required by the study and the noise factors were not controlled at their designated values, the combination of noise factors could blur the actual signal coming from the systematic variation of factors A, B, and C.  In that situation, the results may indicate that the factors do not impact purity within the ranges tested.

 

Is there a way to tell that the results are not reliable because the experiments were not performed as intended, especially if there are no replicate experiments in the matrix?

 

  

P_Bartell
Level VIII

Re: Screening designs - How do we know if the effects are not significant or if the whole data is not good (significant variation) without doing replicates?

The only technique I can think of that you might be able to use for the scenario you describe is comparing the empirical results to what is expected IF the experimental trials were run as prescribed. Obviously, the difference between the expected and observed responses is going to have to be big enough to arouse suspicion...but that's my only idea for you.

 

A potentially helpful technique might be to use the JMP Prediction Profiler with a model you believe...then, for given factor combinations from the 'new' experiment, ask 'how far away from the predicted response is the actual response?'...using the prediction interval as the criterion.
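For anyone who wants to see that check outside of JMP, here is a hedged sketch in Python/statsmodels. The factor names, design, and confirmation value are all hypothetical; the point is simply to ask whether an observed confirmation run falls inside the model's prediction interval.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical screening-design results (coded -1/+1 factor levels).
runs = pd.DataFrame({
    "temp":   [-1,  1, -1,  1, -1,  1, -1,  1],
    "time":   [-1, -1,  1,  1, -1, -1,  1,  1],
    "ratio":  [-1, -1, -1, -1,  1,  1,  1,  1],
    "purity": [97.2, 98.1, 97.5, 98.4, 96.9, 97.8, 97.3, 98.2],
})

# A main-effects model you believe (analogue of the model behind the Profiler).
model = smf.ols("purity ~ temp + time + ratio", data=runs).fit()

# Confirmation run at a new factor setting, plus the observed result.
new_run = pd.DataFrame({"temp": [1], "time": [1], "ratio": [-1]})
pi = model.get_prediction(new_run).summary_frame(alpha=0.05)
lo, hi = pi["obs_ci_lower"].iloc[0], pi["obs_ci_upper"].iloc[0]

observed = 95.8  # hypothetical confirmation result
print(f"95% prediction interval: [{lo:.2f}, {hi:.2f}]; observed: {observed}")
# A result well outside the interval suggests either the model is wrong or the
# run was not executed as specified.
```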

 

The scenario you described actually happened to me once on a project with a team of skeptics of DOE in general...I was the statistician on the team and many on the team felt DOE was akin to black magic. We ran a confirmation trial based on our chosen optimal x's. The results were not even close for virtually all of our responses. So the team went into our best Sherlock Holmes mode to try to find 'what went wrong?' Turns out the operator had intentionally chosen one of the key x factors to be 'what we always use...not this ridiculous level written on the sheet'. Shame on us for not keeping the operator in the loop...we wanted that 'ridiculous level' to confirm our process knowledge. So once we discovered what the operator had actually chosen, I trotted out the JMP Prediction Profiler and entered the inappropriate level for the x factor...lo and behold the Profiler predicted with remarkable accuracy the lousy results we observed. By the end of that meeting even the DOE skeptics were won over.

P_Bartell
Level VIII

Re: Screening designs - How do we know if the effects are not significant or if the whole data is not good (significant variation) without doing replicates?

Here's a second idea for you that doesn't help DETECT the root cause of 'bad data'...but PREVENT it...borne of personal experience. After a few years of DOE consulting for numerous projects at my employer (mostly in a new process/product development space), it dawned on me that, far too often for our liking, the people actually executing the experiments would 'take things into their own hands' and not follow to the letter what we wanted done.

 

For example, randomization of trials often fell by the wayside because the operator found resetting factors in a random fashion a pain and would group the 'pain to change' factor runs together to make the experiment, well, less painful. Ouch...there goes randomization...and we didn't specify a split-plot design. So right there we open ourselves up to the potential for 'bad data'.

 

So what I started insisting on was being present when the experiment was run, to watch and try to help the operators along the way. I found this had several benefits...one is that I often saw something that might influence our results in a way we didn't want...like the scenario above. I'd insist on executing the experiment in the random order we specified...and not letting anyone take matters into their own hands. On the teamwork side...it also helped me build street cred with the operators, because I, the white-shirted statistician, was willing to go into the lab or out on the shop floor, get my hands dirty, and sometimes work the night or graveyard shift right along with all the other working stiffs...and not sit in my ivory-tower office all day long issuing proclamations to the proletariat.

jmpkat
Level I

Re: Screening designs - How do we know if the effects are not significant or if the whole data is not good (significant variation) without doing replicates?

Thank you very much for your input.  I completely understand operators taking matters into their own hands and the "inherent bias" for one parameter over others.  I can relate to that, and the suggestion is well taken.

statman
Super User

Re: Screening designs - How do we know if the effects are not significant or if the whole data is not good (significant variation) without doing replicates?

I will share my thoughts, though you may not agree with them.  Replicated designs are only one way to include noise in the experiment.  Unreplicated designs can be quite useful.  I use Factor Relationship Diagrams to graphically display the relationships between the design factors and the noise factors in an experiment.  This way you understand which factors make up the basis for comparison, and you can use scientific/engineering judgment to determine whether that noise is representative of future conditions; it may also stimulate thinking and hypothesis development about the potential effect of such noise.  Partitioning the noise can be quite useful for increasing the design's precision without negatively affecting the inference space.

 

Your statement "Now the goal of the study is to do experiments in the lab (on small scales mimicking large scale operation as much as possible), systematically varying factors A, B and C while keeping all other noise factors constant." doesn't make any sense to me.  First, the goal should be, IMHO, to understand the causal structure affecting the chemical purity!  Second, noise, by definition, is the set of factors you ARE NOT WILLING/ABLE TO CONTROL.  The way you are collecting data is through experimentation (rather than directed sampling, where no manipulation is done and you rely on partitioning the sources via how you sample and rationally subgroup the data).  Holding noise constant for the entire experiment is a terrible idea unless you intend to hold it constant forever (and pay the added expense of controlling that noise), which, of course, makes those factors controlled, not noise.

As you probably know, Fisher discovered this was a bad idea, as the resultant inference space was so narrow as to be useless when faced with the reality that noise varies and may impact the results (see Fisher's papers on his agricultural experiments).  He introduced the technique of blocking to handle the noise: the noise is held constant within a block and purposefully varied between blocks, so you can both expand the inference space and increase the precision of the design simultaneously.  Blocking also allows for estimation of block-by-factor interactions, a measure of robustness.
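As a rough illustration of that idea (invented data, Python/statsmodels rather than JMP): a 2^3 design run in two blocks, with the block entering the model so the factor effects are judged against within-block noise rather than block-to-block drift.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical 2^3 design run in two blocks (e.g. two days or reagent lots),
# coded -1/+1; the ABC interaction is confounded with blocks.
runs = pd.DataFrame({
    "A":      [-1,  1, -1,  1, -1,  1, -1,  1],
    "B":      [-1, -1,  1,  1, -1, -1,  1,  1],
    "C":      [-1, -1, -1, -1,  1,  1,  1,  1],
    "block":  [ 1,  2,  2,  1,  2,  1,  1,  2],
    "purity": [97.0, 98.2, 97.6, 98.5, 96.8, 97.9, 97.4, 98.3],
})

# The block is a fixed effect here; it soaks up noise that differs between blocks.
fit = smf.ols("purity ~ C(block) + A + B + C", data=runs).fit()
print(fit.summary())
```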

 

"Block what you can, randomize what you cannot" G.E.P. Box

 

Shewhart suggested in the Shewhart Cycle to "carry out the study, preferably on a small scale".  I believe what is intended is that you want to simulate real-world conditions in the small-scale study.  I think the way you do this is to exaggerate the effects of noise so as to capture the long-term variation of the noise in a very short time period (the experiment).

 

That being said, to answer your question, I use a pseudo-Bayesian philosophy.  I have the scientists/engineers predict the data for each treatment of the experiment (a priori, of course, and biased by the engineers' hypotheses), predict ALL possible outcomes (to mitigate that bias), and predict what actions they would take for ALL possible outcomes.  This gives context to the experimental analysis and provides a practical approach to evaluating the data.  For example, if you get a result from a treatment that is wildly different from what was expected, ask whether that data can be explained (by the hypothesis or a modified hypothesis); if not, perhaps question the execution of that treatment or what else was going on during the running of that treatment (and if you have predicted why data might not match, this increases the chance of finding the random effect).

Paraphrasing Louis Pasteur: "Chance favors the prepared mind".

"All models are wrong, some are useful" G.E.P. Box
jmpkat
Level I

Re: Screening designs - How do we know if the effects are not significant or if the whole data is not good (significant variation) without doing replicates?

I agree that my statement "keep all noise factors constant" is not correct in the true sense.  But my definition of "noise" is different in this context.  Let me give an example.  

 

Let us say my reaction is not super sensitive to water, so we don't have to measure the water content of the reaction medium.  However, if there is a ton of water (something like 20% water), it can impact the quality, significantly enough to skew the response.  If the chemist performs the chemical reaction in a properly dried lab-scale reactor (common practice), it should not be a problem.  But let us say the chemist did a shoddy job of drying the lab-scale reactor and carried out one (or several) of the experiments in the design table in a reactor that was significantly wet; then the result is not reliable.  Under good laboratory practice this should not happen, but I am talking about non-ideal conditions.  This is not a factor we would study, but it could have a significant impact if chemists do not follow instructions properly.  When I said "noise", I meant a situation like this.  Now, if the experiments in the design table were performed in some properly dried reactors, some partially dried, and some not dried at all, while the instruction is to do all experiments in properly dried reactors, the results can be all over the place.

 

In this scenario, is it possible to figure out from the analysis that the data are not reliable?

statman
Super User

Re: Screening designs - How do we know if the effects are not significant or if the whole data is not good (significant variation) without doing replicates?

I believe I already provided an answer to your question.  It may not be glamorous, but I believe interpretation through the eyes of the scientist/engineer is a rational and logical approach.  If you can evaluate whether the results seem reasonable or are "all over" by practical analysis (which is greatly enhanced by predicting the results before you get them), then you should have clues that something was not done properly.  I suppose you could also develop a model and look at the residuals, RMSE, the gap between R-square and adjusted R-square, p-values, and so on to determine how poorly the model describes the process.
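A minimal sketch of those diagnostics in Python/statsmodels, using a made-up 2^3 design and purity values (nothing here comes from the original poster's data):

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical coded 2^3 design matrix (columns A, B, C) and purity responses.
X = np.array([[-1, -1, -1], [ 1, -1, -1], [-1,  1, -1], [ 1,  1, -1],
              [-1, -1,  1], [ 1, -1,  1], [-1,  1,  1], [ 1,  1,  1]])
y = np.array([97.1, 98.0, 97.4, 98.3, 96.7, 97.9, 97.2, 98.1])

fit = sm.OLS(y, sm.add_constant(X)).fit()

print(f"RMSE:        {np.sqrt(fit.mse_resid):.3f}")
print(f"R^2:         {fit.rsquared:.3f}")
print(f"adj R^2:     {fit.rsquared_adj:.3f}")
print(f"p-values:    {np.round(fit.pvalues, 3)}")
print(f"max |resid|: {np.abs(fit.resid).max():.3f}")
# Huge residuals, an RMSE far above the known measurement repeatability, or a
# large gap between R^2 and adjusted R^2 are hints that execution problems,
# rather than the factors, drove the responses.
```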

"All models are wrong, some are useful" G.E.P. Box
statman
Super User

Re: Screening designs - How do we know if the effects are not significant or if the whole data is not good (significant variation) without doing replicates?

First, welcome to the community.  We'll try to give you some ideas, and feel free to respond with more information regarding your situation.

Pete brought up some excellent points.  Here are my additional thoughts:

1. Robustness to me means consistent performance of the model over changing noise (quantitatively, the absence of noise-by-model interactions).  Noise could be ambient conditions, purity of the incoming chemicals, cleanliness of the vat, measurement system errors, the variation of X's not specifically controlled, the variation of X's within the levels of control, etc., or, as I define noise, the factors you are not willing to manage/control.

2. Could you collect data for the response variable while simultaneously recording the values of the X's?  Use this data to perform a linear regression.  You would hope the variation in the response variable is relatively small and there is no relationship with the X's.  Bear in mind, though, that the X's likely don't vary much either, which is why there may be little evidence of a causal relationship even when in truth there is one.

3. What they are really asking (I suppose) is not that you do a screening design, since the intent of that is to boldly change X's and discover potentially causal relationships, but something more like a tolerance design.  Perhaps it is semantics, but what they want to know is whether the normal, random variation of the X's impacts the response variable.  So choose levels that are representative of the extremes of the natural variation of the X's and run some sort of factorial design (see the sketch after this list).  I do think you should employ some strategy to handle other sources of noise as well (blocking, repetition, split-plots, etc.).

4. I'm not sure what you mean by "If the experimenter collected data that is not good".  Does this mean the measurement systems are suspect?  If so, you should study the measurement systems and, if conducting an experiment, take repeated measures to estimate the measurement component and perhaps average over its variation to increase the precision of the design.
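As a rough sketch of the 'levels at the extremes of natural variation' idea from point 3 (plain Python; the factor names, set points, and standard deviations are invented for illustration):

```python
import itertools
import random

# Hypothetical set points and estimates of their natural (uncontrolled) variation.
set_points = {"temp_C": 60.0, "equiv": 1.10, "time_h": 4.0}
natural_sd = {"temp_C": 1.5,  "equiv": 0.02, "time_h": 0.25}

# Levels at roughly the extremes of normal variation (here set point +/- 3 SD).
levels = {k: (v - 3 * natural_sd[k], v + 3 * natural_sd[k])
          for k, v in set_points.items()}

# Full 2^3 factorial over those levels, in a randomized run order.
runs = [dict(zip(levels, combo)) for combo in itertools.product(*levels.values())]
random.shuffle(runs)

for i, run in enumerate(runs, 1):
    print(f"run {i}: {run}")
```

If the response stays flat across these runs (relative to measurement noise), that is direct evidence for the kind of robustness the regulators are asking about.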

"All models are wrong, some are useful" G.E.P. Box