bradleyjones

Staff

Joined:

Mar 30, 2012

Proper and improper use of Definitive Screening Designs (DSDs)

In 2011, my colleague Prof. Chris Nachtsheim and I introduced Definitive Screening Designs (DSDs) with a paper in the Journal of Quality Technology. A year later, I wrote a JMP Blog post describing these designs using correlation cell plots. Since their introduction, DSDs have found applications in areas as diverse as paint manufacturing, biotechnology, green energy and laser etching.

When a new and exciting methodology comes along, there is a natural inclination for leading-edge investigators to try it out. When these investigators report positive results, it encourages others to give the new method a try as well.

I am a big fan of DSDs, of course, but as a co-inventor I feel a responsibility to the community of practitioners of design of experiments (DOE) to be clear about their intended use and possible misuse.

So when should I use a DSD?

As the name suggests, DSDs are screening designs. Their most appropriate use is in the earliest stages of experimentation when there are a large number of potentially important factors that may affect a response of interest and when the goal is to identify what is generally a much smaller number of highly influential factors.

Since they are screening experiments, I would use a DSD only when I have four or more factors. Moreover, if I had only four factors and wanted to use a DSD, I would create a DSD for six factors and drop the last two columns. The resulting design can fit the full quadratic model in any three of the four factors.
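The column-dropping idea above can be checked numerically. The following is a minimal sketch, not the JMP implementation: it assumes the six-factor DSD is built by stacking a 6x6 conference matrix, its negative, and one center run, and it assumes a Paley construction for the conference matrix (JMP's own design may differ by row and column permutations and sign changes). The function names are hypothetical. It drops the last two columns and computes the rank of the full quadratic model matrix for every three-factor subset of the remaining four factors.

```python
from fractions import Fraction
from itertools import combinations

def conference_matrix_6():
    """Order-6 conference matrix from the Paley construction over GF(5) (assumed)."""
    chi = {0: 0, 1: 1, 2: -1, 3: -1, 4: 1}  # quadratic character mod 5
    rows = [[0] + [1] * 5]
    for a in range(5):
        rows.append([1] + [chi[(a - b) % 5] for b in range(5)])
    return rows

def dsd_from_conference(c):
    """Stack C, -C and one center run: 2m + 1 runs for m factors."""
    m = len(c)
    return [row[:] for row in c] + [[-v for v in row] for row in c] + [[0] * m]

def quadratic_model_row(x):
    """Intercept, main effects, two-factor interactions, pure quadratics."""
    k = len(x)
    inter = [x[i] * x[j] for i, j in combinations(range(k), 2)]
    return [1] + list(x) + inter + [v * v for v in x]

def rank(matrix):
    """Exact rank by Gaussian elimination over the rationals."""
    a = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    for col in range(len(a[0])):
        pivot = next((i for i in range(r, len(a)) if a[i][col] != 0), None)
        if pivot is None:
            continue
        a[r], a[pivot] = a[pivot], a[r]
        for i in range(r + 1, len(a)):
            f = a[i][col] / a[r][col]
            a[i] = [u - f * v for u, v in zip(a[i], a[r])]
        r += 1
    return r

dsd6 = dsd_from_conference(conference_matrix_6())  # 13 runs x 6 factors
dsd4 = [run[:4] for run in dsd6]                   # drop the last two columns

# The full quadratic model in three factors has 1 + 3 + 3 + 3 = 10 terms.
ranks = {}
for subset in combinations(range(4), 3):
    X = [quadratic_model_row([run[j] for j in subset]) for run in dsd4]
    ranks[subset] = rank(X)
print(ranks)
```

A rank of 10 for a subset means all ten terms of the three-factor quadratic model are estimable from the 13 runs, leaving three degrees of freedom for error.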

DSDs work best when most of the factors are continuous. That is because each continuous factor has three levels, allowing an investigator to fit a curve rather than a straight line for each continuous factor.

When is using a DSD inappropriate?

Below, the optimal split-plot design dramatically outperforms the Definitive Screening Design created by sorting the hard-to-change factor, wp. See point 4) below.

[Figure: comparative power of an optimal split-plot design vs. a Definitive Screening Design created by sorting the hard-to-change factor.]

1) When there are constraints on the design region

An implicit assumption behind the use of DSDs is that it is possible to set the levels of any factor independently of the level of any other factor. This assumption is violated if a constraint on the design region makes certain factor combinations infeasible. For example, if I am cooking popcorn, I do not want to set the power at its highest setting while using a long cooking time. I know that if I do that, I will end up with a charred mess.
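To make the popcorn example concrete, here is a small sketch that counts how many runs of a six-factor DSD would land in a forbidden corner. Everything specific here is an assumption for illustration: the conference matrix is one Paley-type choice, the first two columns are arbitrarily labeled power and time, and the constraint function is hypothetical.

```python
# One 6x6 conference matrix (Paley-type construction, assumed for illustration).
C = [
    [0,  1,  1,  1,  1,  1],
    [1,  0,  1, -1, -1,  1],
    [1,  1,  0,  1, -1, -1],
    [1, -1,  1,  0,  1, -1],
    [1, -1, -1,  1,  0,  1],
    [1,  1, -1, -1,  1,  0],
]
# DSD: the matrix, its fold-over, and one center run (13 runs x 6 factors).
dsd = C + [[-v for v in row] for row in C] + [[0] * 6]

POWER, TIME = 0, 1  # hypothetical assignment of the first two columns

def is_feasible(run):
    """Hypothetical constraint: never combine highest power with longest time."""
    return not (run[POWER] == 1 and run[TIME] == 1)

infeasible = [i for i, run in enumerate(dsd) if not is_feasible(run)]
print(f"{len(infeasible)} of {len(dsd)} runs violate the constraint: {infeasible}")
```

Any run flagged here could not be executed as designed, which is why a constrained region calls for a design built for that region.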

It might be tempting to draw the ranges of the factors inward to avoid such problems, but this practice reduces the DSD’s power to detect active effects. It is better to use the entire feasible region even if the shape of that region is not cubic or spherical.
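A rough way to see the loss of power from shrinking the ranges, under stated assumptions: suppose the response is approximately linear in one factor with a fixed slope per physical unit, and the run-to-run error standard deviation sigma is known. In a minimum-run DSD built from a conference matrix, each factor column has 2m - 2 nonzero entries of plus or minus 1 and is orthogonal to the other model columns, so the standard error of a coded main effect is sigma / sqrt(2m - 2). The function name and the numbers below are hypothetical.

```python
import math

def main_effect_signal_to_noise(slope_per_unit, half_range, sigma, m):
    """Coded main effect divided by its standard error in an m-factor DSD.

    Assumes a locally linear response and a conference-matrix DSD with
    2m + 1 runs, where each factor column has 2m - 2 nonzero entries.
    """
    coded_effect = slope_per_unit * half_range   # effect of moving from center to high
    std_error = sigma / math.sqrt(2 * m - 2)     # does not depend on the range
    return coded_effect / std_error

full = main_effect_signal_to_noise(slope_per_unit=0.5, half_range=10.0, sigma=2.0, m=6)
shrunk = main_effect_signal_to_noise(slope_per_unit=0.5, half_range=5.0, sigma=2.0, m=6)
print(f"full range: {full:.2f}, half the range: {shrunk:.2f}")
```

Halving the range halves the signal-to-noise ratio, because the noise in the estimate stays the same while the coded effect shrinks in proportion to the range.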

2) When some of the factors are ingredients in a mixture

Similarly, using a DSD is inappropriate if two or more factors are ingredients in a mixture. If I raise the percentage of one ingredient, I must lower the percentage of some other ingredient, so these factors cannot vary independently by their very nature.

3) When there are categorical factors with more than two levels

DSDs can handle a few categorical factors at two levels, but if most of the factors are categorical, using a DSD is inefficient. Also, DSDs are generally an undesirable choice if categorical factors have more than two levels. A recent discussion in The Design of Experiment (DOE) LinkedIn group involved trying to modify a DSD to accommodate a three-level categorical factor. Though this is possible, it required using the Custom Design tool in JMP, treating the factors of the DSD as covariate factors and adding the three-level categorical factor as the only factor whose levels were chosen by the Custom Design algorithm.

4) When the DSD is run as a split-plot design

It is also improper to alter a DSD by sorting the settings of one factor so that the resulting design is a split-plot design. For the six-factor DSD, the sorted factor would have only three settings. There would be five runs at the low setting, three runs at the middle setting and five runs at the high setting. Using such a design would mean that inference about the effect of the sorted factor would be statistically invalid.
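The run counts above can be reproduced with a short sketch. It assumes the six-factor DSD comes from one particular Paley-type conference matrix and treats the first column as the hard-to-change factor wp; both choices are illustrative, and any column gives the same counts.

```python
from collections import Counter

# One 6x6 conference matrix (Paley-type construction, assumed for illustration).
C = [
    [0,  1,  1,  1,  1,  1],
    [1,  0,  1, -1, -1,  1],
    [1,  1,  0,  1, -1, -1],
    [1, -1,  1,  0,  1, -1],
    [1, -1, -1,  1,  0,  1],
    [1,  1, -1, -1,  1,  0],
]
dsd = C + [[-v for v in row] for row in C] + [[0] * 6]  # 13 runs

WP = 0  # hypothetical hard-to-change factor

counts = Counter(run[WP] for run in dsd)

# Sorting by wp groups the runs into consecutive blocks with a constant wp
# setting; each block is one whole plot.
sorted_design = sorted(dsd, key=lambda run: run[WP])
whole_plots = []
for run in sorted_design:
    if not whole_plots or whole_plots[-1][0] != run[WP]:
        whole_plots.append([run[WP], 0])
    whole_plots[-1][1] += 1
print(whole_plots)
```

Sorting produces only three whole plots, one per setting of wp. With no setting repeated across whole plots, there is no replication from which to estimate whole-plot error, which is one way to see why inference about the sorted factor breaks down.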

5) When the a priori model of interest has higher order effects

For DSDs, cubic terms are confounded with main effects, so identifying a cubic effect is impossible.

Regular two-level fractional factorial designs and Plackett-Burman designs are also inappropriate for most of the above cases. So, they are not a viable alternative.

What is the alternative to using a DSD in the above cases?

For users of JMP, the answer is simple: Use the Custom Design tool.

The Custom Design tool in JMP can generate a design that is built to accommodate any combination of the scenarios listed above. The guiding principle behind the Custom Design tool is

“Designs should fit the problem rather than changing the problem to suit the design.”

Final Thoughts

DSDs are extremely useful designs in the scenarios for which they were created. As screening designs they have many desirable characteristics:

1) Main effects are orthogonal.

2) Main effects are orthogonal to two-factor interactions (2FIs) and quadratic effects.

3) All the quadratic effects of continuous factors are estimable.

4) No 2FI is confounded with any other 2FI or quadratic effect although they may be correlated.

5) For DSDs with 13 or more runs, it is possible to fit the full quadratic model in any three-factor subset.

6) DSDs can accommodate a few categorical factors having two levels.

7) Blocking DSDs is very flexible. If there are m factors, you can have any number of blocks between 2 and m.

8) DSDs are inexpensive to field, requiring a minimum of only 2m+1 runs.

9) You can add runs to a DSD by creating a DSD with more factors than necessary and dropping the extra factors. The resulting design has all of the first seven properties above and has more power as well as the ability to identify more second-order effects.
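Several of the properties listed above can be checked directly on a small example. The sketch below is an illustration under assumptions, not a proof: it builds a six-factor DSD from one Paley-type conference matrix and checks properties 1, 2 and 4 along with the run count from property 8. Variable and function names are hypothetical.

```python
import math
from itertools import combinations

# One 6x6 conference matrix (Paley-type construction, assumed for illustration).
C = [
    [0,  1,  1,  1,  1,  1],
    [1,  0,  1, -1, -1,  1],
    [1,  1,  0,  1, -1, -1],
    [1, -1,  1,  0,  1, -1],
    [1, -1, -1,  1,  0,  1],
    [1,  1, -1, -1,  1,  0],
]
m = 6
dsd = C + [[-v for v in row] for row in C] + [[0] * m]  # 2m + 1 = 13 runs

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def corr(u, v):
    """Pearson correlation between two model columns."""
    n = len(u)
    cu = [a - sum(u) / n for a in u]
    cv = [b - sum(v) / n for b in v]
    return dot(cu, cv) / math.sqrt(dot(cu, cu) * dot(cv, cv))

main = [[run[j] for run in dsd] for j in range(m)]
quad = [[v * v for v in col] for col in main]
twofi = [[a * b for a, b in zip(main[i], main[j])]
         for i, j in combinations(range(m), 2)]
second_order = twofi + quad

# Property 1: main effects are mutually orthogonal.
main_orthogonal = all(dot(u, v) == 0 for u, v in combinations(main, 2))

# Property 2: main effects are orthogonal to every 2FI and quadratic column.
main_vs_second = all(dot(u, s) == 0 for u in main for s in second_order)

# Property 4: second-order columns may be correlated but never perfectly.
max_abs_corr = max(abs(corr(u, v)) for u, v in combinations(second_order, 2))

print(len(dsd), main_orthogonal, main_vs_second, round(max_abs_corr, 3))
```

The fold-over structure is what drives property 2: every run appears with its mirror image, so any product of an odd number of factor settings sums to zero over the design.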

In my opinion, the above characteristics make DSDs the best choice for any screening experiment where most of the factors are continuous.

However, I want to make it clear that using a DSD is not a panacea. In other words, a DSD is not the solution to every experimental design problem.

 

21 Comments
Community Member

Ron Andrews wrote:

I found an application for a DSD with a mixture that worked. I needed to confirm the weigh-out specs for a mixture of monomers used to make contact lenses. This is a true mixture, not a solution, so a DSD would not be a good choice in most cases. What made it work was the small variation from the nominal. The nominal formula was used as the center point. Each component was varied +/- 1% of the nominal. To analyze the experiment I calculated the actual concentrations for each mix. There was a significant correlation between the largest two components in the mixture, but the design allowed separate estimates of the two factors.

I've submitted an abstract for a Discovery Summit presentation that includes this as one of the examples of mixture design alternatives. Whether it gets accepted or not, I can provide more details.

Community Member

Paul Prew wrote:

Hello Bradley, this post is making me think about appropriate designs for 4-6 factors. Specifically, last week, I had a chemist who wanted to investigate 5 continuous factors. Preliminary one-factor-at-a-time testing told her they were important. She knew nothing about DOE, but her supervisor told her it was a more efficient path -- go talk to a statistician.

We were in a gray area where a pure screening design would have seemed like a step backwards, inefficient. So we chose to design for a quadratic model, leading to a custom design of 25 runs with a few replicates.

A 5-factor DSD would have been far fewer runs, even designing for 6 or 7 factors and dropping the extra columns. How could I use JMP to evaluate the DSD as a good design for fitting a 2nd order model? What specific Design Evaluation tools would you recommend to compare the DSD w/ the custom design?

Thanks, Paul

Community Member

Bradley Jones wrote:

Paul,

The custom design with a few replicates is a very safe response surface design. My experience is that generally there are few large 2nd order effects. So building a design that can estimate every 2nd order effect may end up being overkill. The 6 factor DSD only requires 13 runs and can reliably identify two or three 2nd order effects. That would save you almost 50% of the cost.

If instead you used the 8 factor DSD and ignored the last 3 columns, you would have a 17 run design and you should be able to identify four or five 2nd order effects.

Community Member

Ewoud Schuring wrote:

Hello Bradley,

Interesting blog, actually the whole DSD is very interesting to me. However I do have a question.

What is the limit of the number of categorical factors in a DSD? At the moment I'm trying to create a DSD with 3 continuous and 3 categorical factors. Is this a good idea? What are the risks? Or is a 1/1 ratio of continuous vs categorical resulting in a non-efficient design?

Staff

Bradley Jones wrote:

You can add 3 categorical factors to a DSD without making the design inefficient. Below is a reference to a paper that Chris Nachtsheim and I published on this subject.

Jones, Bradley, and Nachtsheim, C. J. (2013) "Definitive Screening Designs with Added Two-Level Categorical Factors", Journal of Quality Technology, 45:2, 120-129.

Community Member

Jo chotix wrote:

I was thinking of using DSD to design a treatment plan for an in vivo mouse study. I have 4 factors that alter the concentration of the treatment (eg conc drug, excipient blends etc) and one factor focused on age of the mice. I was wondering if this is something you would recommend DSD for?

Staff

Bradley Jones wrote:

It might be appropriate to use a DSD but I am concerned that the 4 factors that alter the concentration of the treatment might be viewed as ingredients in a mixture. If so, that would make using a DSD inappropriate with one exception. If the majority ingredient in the mixture is a filler that is inactive, then the other ingredients may be varied independently. In that case a DSD would be OK to use.

Community Member

Jo Chotix wrote:

Looks like we will be able to vary each of the components independently. Should I be planning for some validation runs? How does one plan for those/factor into design (number of runs, what design points for validation etc)? How is this different from augmenting the design? Thanks!

Community Member

Steven Moore wrote:

So....I have been using Definitive Screening Designs as well as Custom Designs for quite some time. In my opinion, there is no reason to use the traditional Factorial Designs any more. Am I correct, or am I missing something?

Staff

Bradley Jones wrote:

I believe in the routine use of the Custom Designer for most problems in industrial design of experiments. When almost all of the factors are continuous and can be varied independently over their entire range, then Definitive Screening Designs are a good choice.

I sometimes use Full Factorial Designs when I am doing Monte Carlo computer simulation studies where the only expense is computer time. I agree that there are now better choices than the regular two-level fractional factorial designs.

Staff

Bradley Jones wrote:

It is always good to plan for validation runs. These are runs that you do after having run the experiment and analyzed the data. The result of the analysis is a model, which you can use to make response predictions at factor settings that were not run in the experiment. Considering your goal for a response, you can choose to run your validation runs at a factor setting (or settings) where your model predicts the most improvement.

This is different from design augmentation, which extends the range of the factors, adds terms to the model, or adds more runs to reduce the uncertainty about the estimates in the current model.

Community Member

Great post. However, I have one question that puzzles me. You mention that you can easily add extra runs by taking a higher-m design and dropping the extra columns. I was wondering if there is an easy way to "append" runs to a design. For example, I have run an m=5 DSD and then decide I need more power and would like to add some extra runs afterwards. How do I do that? (For example, if I take the m=7 design and drop the last two columns and compare it to the m=5 design, I won't get the same rows anymore.) Thanks!

Community Member

DSD's can definitely run into trouble when there are too many factors and interactions! I tried one in a catalyst system which had been through generations of optimization and had many functioning catalyst components and interactions. I wound up having to add quite a few runs.

Community Trekker

Are there examples published where columns have been dropped as discussed below?

 

If instead you used the 8 factor DSD and ignored the last 3 columns, you would have a 17 run design and you should be able to identify four or five 2nd order effects.

 

 

Staff

http://dx.doi.org/10.1080/00401706.2016.1234979

The above is the URL for a paper in the journal Technometrics by Chris Nachtsheim and me, titled Effective Design-Based Model Selection for Definitive Screening Designs.

 

This paper provides strong motivation for adding what we call "Fake Factors" (i.e., factors that are used in creating the design but are not used when running the experiment). In JMP 13 the DSD design tool calls this "Extra Runs" with a default value of 4. However, you can change this field back to 0 to get the minimum run DSD. Adding these extra runs allows for a powerful alternative analytical approach.

Community Trekker

Thanks Dr. Jones - This issue is not yet available on the ASQ Technometrics site.  I'll look for it to be published soon.  

 

Richard

Community Manager

Thank you Brad, informative post, and good responses, helped me learn a lot more about DSDs. 

Community Trekker

Dr. Jones,  I am a long time JMP user (since the first version) and want to thank you for your contributions to the industry.   I have been involved in the design and analysis of over 16,000 experiments across multiple industries.  Perhaps this is the wrong forum, but I have some questions/comments (I hope not to offend), and would value your opinions.

1. When it comes to screening designs, one of the most common failure modes (especially for novice experimenters) is levels that are not set far enough apart (bold enough). Their effects are therefore not realized and those factors may even be dropped from further study (my hypothesis is engineers are reluctant to create too much variation; they are still looking for the solution rather than understanding causal relationships). It seems to me that increasing the number of levels to three further complicates this failure mode?

2. Early in design, many of the factors are qualitative. It seems the restriction on adding too many qualitative factors to the design is a burden. Perhaps this strategy is more applicable as a later iteration in a sequential study?

3. Often screening designs are meant to get direction and filter out insignificant or perhaps more likely identify significant factors for further study. Are quadratic equations really necessary to accomplish this? OK, I suppose there could be the case where the levels are set too bold and therefore significant effects are missed, but that could be remedied fairly easily with center points. If the quadratic is essentially the departure from linear, if you are in a design space fairly far from optimum wouldn't you be more interested in the linear effect first? The quadratic equation at the base of the mountain may not get you to the top of the mountain.

4. Additionally and perhaps most importantly, the strategy to handle noise may be more important than the resolution of the design factors (it is, as you know, a constant struggle for efficient use of resources between design factor resolution and the resolution of the noise effects). If a Randomized Complete Block Design (RCBD) is used, do we really need some of the higher order block-by-effect interactions? If a Balanced Incomplete Block design is used, how easy is it to understand the aliasing?

5. Lastly, and perhaps of least importance to the practitioner, how are DSDs taught? Using traditional fractional factorial designs with identifiable aliasing and ideas such as foldover to de-alias is straightforward. Instilling the mindset and practice of iteration is paramount. How do DSDs fit in an iterative, scientific approach to experimentation?

I thank you in advance for your thoughts.

If you prefer to take this discussion offline, that would be fine with me. You can email me directly at statman at comcast dot net

 

Cheers,

 

Bill Ross

Staff

Dr. Ross - I am very impressed with your 16,000 DOE experiences. Thank you for your long term use of JMP. You have been a JMPer for a longer time than I have!

 

Below I address your five sets of questions and concerns. 

1. You point out that novice experimenters tend to set the levels of continuous factors too close to each other. I have seen this behavior as well. I believe that engineers think that the world is nonlinear. They also know from calculus that the deviation from linearity is small if the change is also small. The problem is that when the changes in a factor are small, noise in the response can propagate into large noise in the estimated slope of the line, making it hard to identify the active factors. So, in teaching DOE we encourage students to be bold in varying the continuous factors. I believe that having low, middle, and high values for each factor, as DSDs provide, can make engineers more confident to spread out the factor's levels. This is because the DSD can detect strong curvature if it exists, which obviates worries about nonlinearity.

2. I mention in my blog above that DSDs are appropriate screening designs when most of the factors are continuous. I would be reluctant to use a DSD when there are more than three two-level categorical factors. When most, or all, of the factors are categorical at two-levels, I would recommend Efficient Foldover Designs. Here is the reference to the article about these designs.  Anna Errore, Bradley Jones, William Li & Christopher J. Nachtsheim (2017) Benefits and Fast Construction of Efficient Two-Level Foldover Designs, Technometrics, 59:1, 48-57, DOI: 10.1080/00401706.2015.1124052

I also think that investigators could try harder to come up with quantitative measures that describe the difference between the two levels of a categorical factor, thus successfully turning a categorical factor into a more meaningful continuous factor.

3. You argue that center points can detect curvature. This is true, but they cannot identify the factor(s) causing the curvature. This ambiguity forces follow-up experimentation, which is unnecessary if a DSD is used. I have seen one case study where a strong quadratic effect had its minimum inside the region of experimentation. A minimum response was desirable and the sweet spot here would be undetectable with a two-level design. Another case study found a strong quadratic effect where the maximum response was inside the region of experimentation. Here, a maximum response was desirable and the sweet spot would, again, be undetectable with a two-level design. The cost for including the ability to estimate all the quadratic effects in a DSD is a 10% (or  less) increase in the length of confidence intervals on the main effects. This seems a small price to pay for the information gain about 2nd order effects.

4. You point out, correctly, that sometimes identifying the source of noise may be more important than estimating fixed effects. In such cases, it is useful to replicate design points so that it is possible to identify a factor affecting the variance of the response while simultaneously evaluating the effects of the factors on the mean response. One could replicate an entire DSD at the cost of roughly four times as many runs as there are factors. You mention blocking in this point. Blocking can identify and quantify one source of variability and remove its effect from the run-to-run variation. When changing environmental conditions introduce noise in the responses, blocking is a useful remedy. DSDs can be blocked more flexibly than two-level fractional factorial or full factorial designs. Below is a reference to a paper describing how to block DSDs.

 Bradley Jones & Christopher J. Nachtsheim (2015): Blocking Schemes for Definitive Screening Designs, Technometrics, DOI: 10.1080/00401706.2015.1013777

5. Your final concern deals with teaching DSDs and with an iterative approach to experimentation. I do not advertise DSDs as "one-shot" experiments. It is true that the aliasing structure of traditional fractional factorial designs is straightforward and fairly easy to explain. However, their complete aliasing of potentially important effects is undesirable in my view. See

 Bradley Jones (2016) 21st century screening experiments: What, why, and how, Quality Engineering, 28:1, 98-106, DOI: 10.1080/08982112.2015.1100462

The above article was a discussion paper with three discussants. My rejoinder dealt with many issues including sequential design following an initial DSD study - reference below.

 Bradley Jones (2016) Rejoinder, Quality Engineering, 28:1, 122-126, DOI: 10.1080/08982112.2015.1100468

The reference below deals with the relative benefit of using an initial DSD even when the predicted optimum response is outside the region of experimentation. It also provides a powerful follow-up design approach that outperforms steepest ascent.

Christopher J. Nachtsheim & Bradley Jones (2018) Design augmentation for response optimization and model estimation, Quality Engineering, 30:1, 38-51, DOI: 10.1080/08982112.2017.1382298

I do not think it is necessary to teach a practitioner how to create a DSD. Many software packages can do this. I think that familiarizing investigators with the correlation cell plot is useful. JMP provides many diagnostic tools for evaluating and comparing designs. These are very useful and teaching their use in short courses and university courses would benefit the community of DOE practitioners. It is useful for practitioners to know how to take advantage of the special geometry of a DSD to fit appropriate models. This is addressed in the reference below.

 Bradley Jones & Christopher J. Nachtsheim (2016): Effective Design-Based Model Selection for Definitive Screening Designs, Technometrics, DOI: 10.1080/00401706.2016.1234979
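Point 3 above says that center points can detect curvature but cannot identify which factor causes it. A minimal sketch of why, under illustrative assumptions: in a two-level design with added center runs, every squared factor column is identical, so the pure quadratic effects are completely aliased with one another. In a DSD the squared columns differ. The 2^3 factorial with three center runs and the Paley-type conference matrix below are choices made for this example only.

```python
from itertools import product

# Two-level 2^3 full factorial plus three center runs (illustrative choice).
factorial = [list(p) for p in product([-1, 1], repeat=3)]
two_level = factorial + [[0, 0, 0] for _ in range(3)]

# One 6x6 conference matrix (Paley-type construction, assumed for illustration).
C = [
    [0,  1,  1,  1,  1,  1],
    [1,  0,  1, -1, -1,  1],
    [1,  1,  0,  1, -1, -1],
    [1, -1,  1,  0,  1, -1],
    [1, -1, -1,  1,  0,  1],
    [1,  1, -1, -1,  1,  0],
]
dsd = C + [[-v for v in row] for row in C] + [[0] * 6]

def squared_columns(design):
    """Return the pure quadratic column of each factor as a tuple."""
    k = len(design[0])
    return [tuple(run[j] ** 2 for run in design) for j in range(k)]

# Number of distinct quadratic columns in each design.
two_level_distinct = len(set(squared_columns(two_level)))
dsd_distinct = len(set(squared_columns(dsd)))
print(two_level_distinct, dsd_distinct)
```

With a single distinct quadratic column, the two-level design can only estimate the sum of the quadratic effects. In the DSD each factor sits at its middle level in a different set of runs, which is what lets the analysis attribute curvature to a specific factor.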

 

I would be happy to continue this conversation and send you pdfs of the references. You can contact me at bradley.jones@jmp.com

Community Trekker

Dr. Jones,

 

Thank you for your thorough and thoughtful responses.  I appreciate your time and consideration.  I will continue the discussion directly with you.

 

Cheers,

Bill

Community Member

Dr. Jones,

 

I have been using DSDs for quite some time now, with quite some success. However, I'm now looking into using a DSD for binary responses (growth/no growth). Can this be done with a DSD, and how would you analyse the response, since I guess it is not built into the DSD analysis algorithm? Or would you use other types of designs?

 

Best,

 

Ewoud Schuring