bradleyjones

Proper and improper use of Definitive Screening Designs (DSDs)

In 2011, my colleague Prof. Chris Nachtsheim and I introduced Definitive Screening Designs (DSDs) with a paper in the Journal of Quality Technology. A year later, I wrote a JMP Blog post describing these designs using correlation cell plots. Since their introduction, DSDs have found applications in areas as diverse as paint manufacturing, biotechnology, green energy and laser etching.

 

When a new and exciting methodology comes along, there is a natural inclination for leading-edge investigators to try it out. When these investigators report positive results, it encourages others to give the new method a try as well.

 

I am a big fan of DSDs, of course, but as a co-inventor I feel a responsibility to the community of practitioners of design of experiments (DOE) to be clear about their intended use and possible misuse.

 

So when should I use a DSD?

 

As the name suggests, DSDs are screening designs. Their most appropriate use is in the earliest stages of experimentation when there are a large number of potentially important factors that may affect a response of interest and when the goal is to identify what is generally a much smaller number of highly influential factors.

 

Since they are screening experiments, I would use a DSD only when I have four or more factors. Moreover, if I had only four factors and wanted to use a DSD, I would create a DSD for six factors and drop the last two columns. The resulting design can fit the full quadratic model in any three of the four factors.
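The four-factor recipe above can be checked numerically. The sketch below is only an illustration under stated assumptions: it builds a six-factor, 13-run DSD by folding over a Paley conference matrix and adding a center run, which is one known way to construct such a design and not necessarily what JMP does internally. The function names are hypothetical. It then drops the last two columns and computes the rank of the full quadratic model matrix for every three-factor subset.

```python
from itertools import combinations

import numpy as np


def paley_conference_matrix(q=5):
    """Symmetric conference matrix of order q + 1 (q prime, q % 4 == 1)."""
    residues = {(k * k) % q for k in range(1, q)}

    def chi(a):
        # Quadratic character mod q: 0 for 0, +1 for residues, -1 otherwise.
        a %= q
        return 0 if a == 0 else (1 if a in residues else -1)

    C = np.ones((q + 1, q + 1), dtype=int)
    C[0, 0] = 0
    C[1:, 1:] = [[chi(i - j) for j in range(q)] for i in range(q)]
    return C


def dsd_from_conference(C):
    """Stack C, its fold-over -C, and one center run: 2m + 1 runs."""
    m = C.shape[1]
    return np.vstack([C, -C, np.zeros((1, m), dtype=int)])


def full_quadratic(X):
    """Intercept, main effects, two-factor interactions, and squares."""
    k = X.shape[1]
    cols = [np.ones(len(X))]
    cols += [X[:, i] for i in range(k)]
    cols += [X[:, i] * X[:, j] for i, j in combinations(range(k), 2)]
    cols += [X[:, i] ** 2 for i in range(k)]
    return np.column_stack(cols)


# Six-factor DSD with the last two columns dropped: 13 runs, 4 factors.
design = dsd_from_conference(paley_conference_matrix(5))[:, :4]

# A full quadratic model in three factors has 10 terms, so rank 10 means
# every term is estimable for that three-factor subset.
ranks = {
    subset: np.linalg.matrix_rank(full_quadratic(design[:, list(subset)]))
    for subset in combinations(range(4), 3)
}
print(design.shape, ranks)
```

For this particular design, each of the four three-factor subsets gives a model matrix of rank 10, matching the number of terms in the full quadratic model.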

 

DSDs work best when most of the factors are continuous. That is because each continuous factor has three levels, allowing an investigator to fit a curve rather than a straight line for each continuous factor.

 

When is using a DSD inappropriate?

Figure: Comparative power of an optimal split-plot design vs. a Definitive Screening Design created by sorting the hard-to-change factor, wp. The optimal split-plot design dramatically outperforms the sorted DSD. See point 4) below.

 

1) When there are constraints on the design region

 

An implicit assumption behind the use of DSDs is that it is possible to set the levels of any factor independently of the level of any other factor. This assumption is violated if a constraint on the design region makes certain factor combinations infeasible. For example, if I am cooking popcorn, I do not want to set the power at its highest setting while using a long cooking time. I know that if I do that, I will end up with a charred mess.

 

It might be tempting to draw the ranges of the factors inward to avoid such problems, but this practice reduces the DSD’s power to detect active effects. It is better to use the entire feasible region even if the shape of that region is not cubic or spherical.
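To make the popcorn example concrete, here is a minimal sketch that counts how many runs of a six-factor DSD land in the forbidden corner. The design matrix is one valid six-factor DSD (a conference matrix, its fold-over and a center run), and assigning the first two columns to power and time is purely an assumption for illustration.

```python
import numpy as np

# One 6 x 6 conference matrix (zero diagonal, orthogonal columns).
C = np.array([
    [0,  1,  1,  1,  1,  1],
    [1,  0,  1, -1, -1,  1],
    [1,  1,  0,  1, -1, -1],
    [1, -1,  1,  0,  1, -1],
    [1, -1, -1,  1,  0,  1],
    [1,  1, -1, -1,  1,  0],
])

# Six-factor DSD: C, its fold-over -C, and one center run (13 runs).
dsd = np.vstack([C, -C, np.zeros((1, 6), dtype=int)])

# Hypothetical assignment: column 0 is power, column 1 is cooking time.
power, time = dsd[:, 0], dsd[:, 1]

# The constraint forbids highest power together with longest time.
infeasible = (power == 1) & (time == 1)
print(f"{infeasible.sum()} of {len(dsd)} runs are infeasible")
```

Because a DSD sets each factor to its extremes regardless of the other factors, some runs fall in the infeasible corner, and those runs cannot simply be skipped without losing the design's structure.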

 

2) When some of the factors are ingredients in a mixture

 

Similarly, using a DSD is inappropriate if two or more factors are ingredients in a mixture. If I raise the percentage of one ingredient, I must lower the percentage of some other ingredient, so these factors cannot vary independently by their very nature.
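The dependence among mixture ingredients shows up directly in the model matrix. The following sketch uses made-up random proportions for three hypothetical ingredients (not data from any real experiment) and shows that an intercept plus the three proportions is rank deficient, because the proportions always sum to one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up blends of three ingredients; each row is rescaled to sum to 1.
raw = rng.random((12, 3))
proportions = raw / raw.sum(axis=1, keepdims=True)

# Model matrix with an intercept and one column per ingredient.
model = np.column_stack([np.ones(len(proportions)), proportions])

# The intercept equals the sum of the ingredient columns, so the
# four columns span only a three-dimensional space.
rank = np.linalg.matrix_rank(model)
print(model.shape[1], "columns, rank", rank)
```

A DSD assumes every factor column can be set independently, which is exactly what this linear dependence rules out.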

 

3) When there are categorical factors with more than two levels

 

DSDs can handle a few categorical factors at two levels, but if most of the factors are categorical, using a DSD is inefficient. Also, DSDs are generally an undesirable choice if categorical factors have more than two levels. A recent discussion in The Design of Experiment (DOE) LinkedIn group involved trying to modify a DSD to accommodate a three-level categorical factor. Though this is possible, it required using the Custom Design tool in JMP, treating the factors of the DSD as covariate factors and adding the three-level categorical factor as the only factor whose levels were chosen by the Custom Design algorithm.

 

4) When the DSD is run as a split-plot design

 

It is also improper to alter a DSD by sorting the settings of one factor so that the resulting design is a split-plot design. For the six-factor DSD, the sorted factor would have only three settings. There would be five runs at the low setting, three runs at the middle setting and five runs at the high setting. Using such a design would mean that inference about the effect of the sorted factor would be statistically invalid.
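The run counts quoted above can be reproduced with a short sketch. The matrix below is one valid six-factor DSD (an assumption for illustration; any six-factor DSD gives the same level counts), and treating the first column as the hard-to-change factor is likewise hypothetical.

```python
import numpy as np

# One 6 x 6 conference matrix (zero diagonal, orthogonal columns).
C = np.array([
    [0,  1,  1,  1,  1,  1],
    [1,  0,  1, -1, -1,  1],
    [1,  1,  0,  1, -1, -1],
    [1, -1,  1,  0,  1, -1],
    [1, -1, -1,  1,  0,  1],
    [1,  1, -1, -1,  1,  0],
])

# Six-factor DSD: C, its fold-over -C, and one center run (13 runs).
dsd = np.vstack([C, -C, np.zeros((1, 6), dtype=int)])

# Sort the runs by the hypothetical hard-to-change factor (column 0).
hard = np.sort(dsd[:, 0])

# Runs at each level, and the number of groups of consecutive runs that
# share a setting (each group would be one whole plot).
levels, counts = np.unique(hard, return_counts=True)
whole_plots = 1 + np.count_nonzero(np.diff(hard))
print(dict(zip(levels.tolist(), counts.tolist())), whole_plots)
```

Sorting leaves just three groups of runs, one per level, so each level of the hard-to-change factor is set only once and no setting is independently repeated.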

 

5) When the a priori model of interest has higher order effects

 

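For DSDs, cubic terms are confounded with main effects, so identifying a cubic effect is impossible. Each continuous factor takes only three levels, coded -1, 0 and +1, and at those levels x^3 = x, so the cubic column is identical to the main-effect column.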

 

Regular two-level fractional factorial designs and Plackett-Burman designs are also inappropriate for most of the above cases. So, they are not a viable alternative.

 

What is the alternative to using a DSD in the above cases?

 

For users of JMP, the answer is simple: Use the Custom Design tool.

 

The Custom Design tool in JMP can generate a design that is built to accommodate any combination of the scenarios listed above. The guiding principle behind the Custom Design tool is

 

“Designs should fit the problem rather than changing the problem to suit the design.”

 

Final Thoughts

 

DSDs are extremely useful designs in the scenarios for which they were created. As screening designs they have many desirable characteristics:

1) Main effects are orthogonal.

2) Main effects are orthogonal to two-factor interactions (2FIs) and quadratic effects.

3) All the quadratic effects of continuous factors are estimable.

4) No 2FI is confounded with any other 2FI or quadratic effect although they may be correlated.

5) For DSDs with 13 or more runs, it is possible to fit the full quadratic model in any three-factor subset.

6) DSDs can accommodate a few categorical factors having two levels.

7) Blocking DSDs is very flexible. If there are m factors, you can have any number of blocks between 2 and m.

8) DSDs are inexpensive to field, requiring a minimum of only 2m+1 runs.

9) You can add runs to a DSD by creating a DSD with more factors than necessary and dropping the extra factors. The resulting design has all of the first seven properties above and has more power as well as the ability to identify more second-order effects.
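Several of the properties listed above can be checked numerically on a small case. This is a sketch on one particular six-factor DSD (a conference matrix, its fold-over and a center run, chosen here as an assumption for illustration), so it is a spot check of properties 1, 2, 3 and 8 and not a proof.

```python
from itertools import combinations

import numpy as np

# One 6 x 6 conference matrix (zero diagonal, orthogonal columns).
C = np.array([
    [0,  1,  1,  1,  1,  1],
    [1,  0,  1, -1, -1,  1],
    [1,  1,  0,  1, -1, -1],
    [1, -1,  1,  0,  1, -1],
    [1, -1, -1,  1,  0,  1],
    [1,  1, -1, -1,  1,  0],
])
D = np.vstack([C, -C, np.zeros((1, 6), dtype=int)])
runs, m = D.shape

# Property 8: the design has 2m + 1 runs.
minimal_runs = runs == 2 * m + 1

# Property 1: main-effect columns are mutually orthogonal.
gram = D.T @ D
main_orthogonal = np.array_equal(gram, 10 * np.eye(m, dtype=int))

# Property 2: main effects are orthogonal to all 2FIs and quadratics.
two_fi = np.column_stack(
    [D[:, i] * D[:, j] for i, j in combinations(range(m), 2)]
)
second_order = np.column_stack([two_fi, D ** 2])
cross_free = not (D.T @ second_order).any()

# Property 3: intercept, main effects and all quadratics are jointly
# estimable (13 model terms, so the matrix must have rank 13).
model = np.column_stack([np.ones(runs), D, D ** 2])
quadratics_estimable = np.linalg.matrix_rank(model) == model.shape[1]

print(minimal_runs, main_orthogonal, cross_free, quadratics_estimable)
```

All four checks hold for this design. The orthogonality to second-order terms follows from the fold-over structure: every run is paired with its mirror image, so any product of an odd number of factor columns sums to zero.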

 

In my opinion, the above characteristics make DSDs the best choice for any screening experiment where most of the factors are continuous.

 

However, I want to make it clear that using a DSD is not a panacea. In other words, a DSD is not the solution to every experimental design problem.

 

13 Comments
Community Member

Ron Andrews wrote:

I found an application for a DSD with a mixture that worked. I needed to confirm the weigh-out specs for a mixture of monomers used to make contact lenses. This is a true mixture, not a solution, so a DSD would not be a good choice in most cases. What made it work was the small variation from the nominal. The nominal formula was used as the center point. Each component was varied +/- 1% of the nominal. To analyze the experiment, I calculated the actual concentrations for each mix. There was a significant correlation between the largest two components in the mixture, but the design allowed separate estimates of the two factors.

I've submitted an abstract for a discovery summit presentation that includes this as one of the examples of mixture design alternatives. Whether it gets accepted or not, I can provide more details.

Community Member

Paul Prew wrote:

Hello Bradley, this post is making me think about appropriate designs for 4-6 factors. Specifically, last week, I had a chemist who wanted to investigate 5 continuous factors. Preliminary one-factor-at-a-time testing told her they were important. She knew nothing about DOE, but her supervisor told her it was a more efficient path -- go talk to a statistician.

We were in a gray area where a pure screening design would have seemed like a step backwards, inefficient. So we chose to design for a quadratic model, leading to a custom design of 25 runs with a few replicates.

A 5-factor DSD would have been far fewer runs, even designing for 6 or 7 factors and dropping the extra columns. How could I use JMP to evaluate the DSD as a good design for fitting a 2nd order model? What specific Design Evaluation tools would you recommend to compare the DSD w/ the custom design?

Thanks, Paul

Community Member

Bradley Jones wrote:

Paul,

The custom design with a few replicates is a very safe response surface design. My experience is that generally there are few large 2nd order effects. So building a design that can estimate every 2nd order effect may end up being overkill. The 6 factor DSD only requires 13 runs and can reliably identify two or three 2nd order effects. That would save you almost 50% of the cost.

If instead you used the 8 factor DSD and ignored the last 3 columns, you would have a 17 run design and you should be able to identify four or five 2nd order effects.

Community Member

Ewoud Schuring wrote:

Hello Bradley,

Interesting blog, actually the whole DSD is very interesting to me. However I do have a question.

What is the limit of the number of categorical factors in a DSD? At the moment I'm trying to create a DSD with 3 continuous and 3 categorical factors. Is this a good idea? What are the risks? Or is a 1/1 ratio of continuous vs categorical resulting in a non-efficient design?

Bradley Jones wrote:

You can add 3 categorical factors to a DSD without making the design inefficient. Below is a reference to a paper that Chris Nachtsheim and I published on this subject.

Jones, Bradley, and Nachtsheim, C. J. (2013) "Definitive Screening Designs with Added Two-Level Categorical Factors", Journal of Quality Technology, 45:2, 120-129.

Community Member

Jo chotix wrote:

I was thinking of using a DSD to design a treatment plan for an in vivo mouse study. I have 4 factors that alter the concentration of the treatment (e.g., conc. of drug, excipient blends, etc.) and one factor focused on the age of the mice. I was wondering if this is something you would recommend a DSD for?

Bradley Jones wrote:

It might be appropriate to use a DSD but I am concerned that the 4 factors that alter the concentration of the treatment might be viewed as ingredients in a mixture. If so, that would make using a DSD inappropriate with one exception. If the majority ingredient in the mixture is a filler that is inactive, then the other ingredients may be varied independently. In that case a DSD would be OK to use.

Community Member

Jo Chotix wrote:

Looks like we will be able to vary each of the components independently. Should I be planning for some validation runs? How does one plan for those/factor into design (number of runs, what design points for validation etc)? How is this different from augmenting the design? Thanks!

Community Member

Steven Moore wrote:

So... I have been using Definitive Screening Designs as well as Custom Designs for quite some time. In my opinion, there is no reason to use the traditional Factorial Designs any more. Am I correct, or am I missing something?

Bradley Jones wrote:

I believe in the routine use of the Custom Designer for most problems in industrial design of experiments. When almost all of the factors are continuous and can be varied independently over their entire range, then Definitive Screening Designs are a good choice.

I sometimes use Full Factorial Designs when I am doing Monte Carlo computer simulation studies where the only expense is computer time. I agree that there are now better choices than the regular two-level fractional factorial designs.

Bradley Jones wrote:

It is always good to plan for validation runs. These are runs that you do after having run the experiment and analyzed the data. The result of the analysis is a model, which you can use to make response predictions at factor settings that were not run in the experiment. Considering your goal for a response, you can choose to run your validation runs at a factor setting (or settings) where your model predicts the most improvement.

This is different from design augmentation, which extends the range of the factors, adds terms to the model, or adds more runs to reduce the uncertainty about the estimates in the current model.

Community Member

Great post. However, I have one question that puzzles me. You mention that you can easily add extra runs by taking a higher-m design and dropping the extra columns. I was wondering if there is an easy way to "append" runs to a design. For example, I have run an m=5 DSD and then decide I need more power and would like to add some extra runs afterwards. How do I do that? (For example, if I take the m=7 design and drop the last two columns and compare it to the m=5 design, then I won't get the same rows anymore.) Thanks!

Community Member

DSDs can definitely run into trouble when there are too many factors and interactions! I tried one in a catalyst system which had been through generations of optimization and had many functioning catalyst components and interactions. I wound up having to add quite a few runs.