Staff
Introducing definitive screening designs

In my two previous posts, I introduced the correlation cell plot for design evaluation and then showed how to use the plot to compare designs. Here, I want to use the same plot to show why definitive screening designs are, well, definitive.

For a complete technical description of definitive screening designs, you can read "A Class of Three-Level Designs for Definitive Screening in the Presence of Second-Order Effects" -- an article I co-wrote with Chris Nachtsheim of the University of Minnesota. Chris and I were delighted to learn recently we had won the American Society for Quality's 2012 Brumbaugh Award for our paper. This award is presented to the author(s) of the paper that has made the largest single contribution to the development of industrial application of quality control. The paper was published in January 2011 in the Journal of Quality Technology, and you can read it via the JMP website.

### What is a definitive screening design?

The most notable way that definitive screening designs differ from standard screening designs is that all the factors are numeric and are tested at three levels. A second distinctive feature of a definitive screening design is that it is a self-foldover. That is, the runs of the design come in pairs that “mirror” each other. Suppose we encode the low setting of a factor as “–”, the high setting as “+” and the middle setting as “0”. Then, if one run of a foldover pair has factor settings encoded “+ 0 – + – +”, the other run has factor settings encoded “– 0 + – + –”. Both runs of a pair have the same single factor at its middle value and all the other factors at their high or low values. One additional run is at the center of the design region, with all the factors at their middle setting. Table 1 shows a definitive screening design for eight factors. Notice that the number of runs is one more than twice the number of factors, that is, 17 runs.

Table 1 Definitive screening design for eight three-level factors.

### So what makes this design so special?

To see why the design in Table 1 is fantastic, let us use the correlation cell plot in Figure 1. Our potential model terms are all the main effects, two-factor interactions and quadratic effects. Note that only the cells on the diagonal of the plot are pure red. That means that none of the model terms are confounded with each other.

Figure 1 Correlation plot for definitive screening design.

The last eight columns of the cell plot show the quadratic effect terms. These effects are only mildly correlated with each other (|r| = 0.19). Each quadratic effect is uncorrelated with every two-factor interaction involving its own factor; for example, the quadratic effect of factor A is uncorrelated with the AB interaction. Its absolute correlation with each of the other two-factor interactions is 0.37. It turns out that all eight quadratic effects are estimable with the definitive screening design. The main effects are all orthogonal to each other and to all the second-order terms (two-factor interactions and quadratic effects).

The two-factor interactions have pairwise correlations that can take one of three values. The pink cells represent absolute correlations of two-thirds. The light blue cells represent correlations of only one-sixth. The pure blue cells show uncorrelated interaction pairs.
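The correlations behind a cell plot like Figure 1 can be reproduced numerically. The following is only a sketch under assumptions: it builds a 17-run, eight-factor design by folding over a Paley conference matrix (the conference-matrix construction is mentioned in the comments below), which may not be run-for-run identical to Table 1. The helper names `paley_conference` and `corr` are made up for this illustration.

```python
import numpy as np
from itertools import combinations

def paley_conference(q):
    """Conference matrix of order q + 1 for an odd prime q (Paley construction)."""
    squares = {(x * x) % q for x in range(1, q)}

    def chi(a):  # quadratic character mod q: 0, +1 (square) or -1 (non-square)
        a %= q
        return 0 if a == 0 else (1 if a in squares else -1)

    C = np.zeros((q + 1, q + 1), dtype=int)
    C[0, 1:] = 1
    C[1:, 0] = 1 if q % 4 == 1 else -1  # symmetric or skew border
    C[1:, 1:] = [[chi(i - j) for j in range(q)] for i in range(q)]
    return C

def corr(u, v):
    """Pearson correlation of two design columns."""
    u = u - u.mean()
    v = v - v.mean()
    return float(u @ v / np.sqrt((u @ u) * (v @ v)))

# Assumed construction: conference matrix, its foldover, and one center run.
C = paley_conference(7)
X = np.vstack([C, -C, np.zeros((1, 8), dtype=int)])  # 17 runs x 8 factors

k = X.shape[1]
main = {(i,): X[:, i] for i in range(k)}
quad = {(i, i): X[:, i] ** 2 for i in range(k)}
inter = {(i, j): X[:, i] * X[:, j] for i, j in combinations(range(k), 2)}
second = {**quad, **inter}

main_vs_second = max(abs(corr(m, s)) for m in main.values() for s in second.values())
quad_vs_quad = {round(abs(corr(quad[a], quad[b])), 3) for a, b in combinations(quad, 2)}
quad_vs_own = max(abs(corr(quad[(i, i)], inter[p]))
                  for i in range(k) for p in inter if i in p)
quad_vs_other = {round(abs(corr(quad[(i, i)], inter[p])), 3)
                 for i in range(k) for p in inter if i not in p}
inter_vs_inter = {round(abs(corr(inter[p], inter[r])), 3)
                  for p, r in combinations(inter, 2)}

print("main vs. second-order (max |r|):", round(main_vs_second, 3))
print("quadratic vs. quadratic:", sorted(quad_vs_quad))
print("quadratic vs. own interactions (max |r|):", round(quad_vs_own, 3))
print("quadratic vs. other interactions:", sorted(quad_vs_other))
print("interaction vs. interaction:", sorted(inter_vs_inter))
```

Under this construction the quadratic-to-quadratic correlation works out to 4/21 ≈ 0.19 and the correlation between a quadratic effect and an interaction not involving its factor is about 0.37, in line with the values quoted above.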

Let us compare this design and plot to the standard screening design for eight factors, the 16-run minimum aberration fractional factorial design. Table 2 shows this design with one center run added, so that both designs have 17 runs including one center run.

Table 2 Standard screening design with one center run.

Figure 2 shows the cell plot for the fractional factorial design. The most notable feature of this cell plot is that all the cells are either pure blue or red. That is, every pair of columns is either completely uncorrelated or completely confounded.

Figure 2 Correlation plot for the standard screening design.

Note the block of red cells in the lower right. These red cells indicate that all the quadratic effects are confounded with each other. With one added center run, the standard screening design has some ability to detect very strong nonlinearity in the factor/response relationship. However, there is no way to determine which factor is causing the nonlinearity. By contrast, the definitive screening design can separately estimate the nonlinear effect of each factor.

Each two-factor interaction in the fractional factorial design is confounded with three other two-factor interactions. This means that if any two-factor interaction is active, the analysis can only indicate that there are four possible interactions that could explain the observed effect. Narrowing down this field to one interaction requires further experimentation. By contrast, the definitive screening design can reliably resolve any two-factor interaction that is large compared to its standard error.
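The confounding pattern described for the fractional factorial design can be checked with a short sketch. The generators used here (E = BCD, F = ACD, G = ABC, H = ABD) are an assumption: they are one common way to write a 16-run resolution IV design for eight factors, and the design in Table 2 may be labeled differently.

```python
import numpy as np
from itertools import combinations, product

# Full factorial in A, B, C, D; the other four columns come from assumed generators.
base = np.array(list(product([-1, 1], repeat=4)))
A, B, C, D = base.T
E, F, G, H = B * C * D, A * C * D, A * B * C, A * B * D
X = np.column_stack([A, B, C, D, E, F, G, H])
X = np.vstack([X, np.zeros((1, 8), dtype=int)])  # add one center run: 17 runs

names = "ABCDEFGH"
inter = {names[i] + names[j]: X[:, i] * X[:, j]
         for i, j in combinations(range(8), 2)}

def corr(u, v):
    """Pearson correlation of two design columns."""
    u = u - u.mean()
    v = v - v.mean()
    return float(u @ v / np.sqrt((u @ u) * (v @ v)))

# For each two-factor interaction, list the interactions it is fully confounded with.
aliases = {p: [r for r in inter if r != p and abs(corr(inter[p], inter[r])) > 0.999]
           for p in inter}

# With a single center run, every quadratic column is the same vector.
quad_identical = all(np.array_equal(X[:, 0] ** 2, X[:, i] ** 2) for i in range(8))

print("AB is confounded with:", aliases["AB"])
print("all quadratic columns identical:", quad_identical)
```

Each interaction ends up with exactly three fully confounded partners, and the eight quadratic columns are indistinguishable, which is the block of red cells in the lower right of Figure 2.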

### Why are definitive screening designs definitive?

The purpose of screening is to separate the vital few factors that have a substantial effect on the response from the trivial many that have negligible effects. If a factor’s effect is strongly curved, a traditional screening design may miss this effect and screen out that factor. If there is an active two-factor interaction, a standard screening design with a similar number of runs for the same number of factors will require follow-up experimentation to resolve the ambiguity. The definitive screening design can reliably accomplish the task of screening even if there are a couple of active second-order effects.

Try the definitive screening designs add-in available from the JMP File Exchange (download requires a free SAS profile). The add-in works with JMP 9 and JMP 10.


Visitor

Paul wrote:

Bradley, your series of blogs on DOE has been instructive, thanks for sharing them. Your explanation of why these designs are deemed definitive for screening makes sense. The screening identifies active factors through 2nd order, so there are all the elements needed to build a response surface and perform optimization. The driving factors could be somewhat ambiguous if a 2-factor interaction is ID'd that is 2/3 correlated with other 2-factor interactions. Do you recommend using the effect heredity principle (an active 2-factor interaction will have at least one of its main effects also active) to resolve this, if it's applicable? Thanks, Paul

Staff

Paul,

The definitive screening design cannot fit all the 2nd order terms because there are not enough runs. However, if there are only a few strong 2nd order effects, then the design can resolve them without requiring further runs (except for confirmation). I do think that making use of the effect heredity principle is generally useful. It dramatically reduces the number of alternative models to fit. JMP 10 has a new feature in Stepwise regression for looking at all possible models up to a certain number of terms restricting the models to exhibit effect heredity. I would recommend using this feature with a full quadratic model as the set of terms to consider.

Visitor

Louis F Valente wrote:

These new definitive screening designs are awesome. I have already applied them twice in two of my latest efforts and the value provided using this methodology has been tremendously attractive and efficient. Bravo!

Lou

Visitor

Dale Kopas wrote:

Brad, excellent work on the definitive screening designs add-in for JMP. After installing the add-in in JMP 10, I generated designs for various numbers of factors (K) from 3 up to 7. For an even number of factors the add-in returns a 2K+1 run design, but for an odd number of factors it appears to return a 2K+1 design plus two additional factorial runs without a middle level. So, for example, it does not return a 7-run design for 3 factors, an 11-run design for 5 factors, or a 15-run design for 7 factors. Instead you get a 9-run design for 3 factors, a 13-run design for 5 factors and a 17-run design for 7 factors (see the 5-factor example below):

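| A | B | C | D | E | Standard Order |
|---|---|---|---|---|---|
| 0 | 1 | 1 | 1 | 1 | 1 |
| 0 | -1 | -1 | -1 | -1 | 2 |
| 1 | 0 | -1 | 1 | 1 | 3 |
| -1 | 0 | 1 | -1 | -1 | 4 |
| 1 | -1 | 0 | -1 | 1 | 5 |
| -1 | 1 | 0 | 1 | -1 | 6 |
| 1 | 1 | -1 | 0 | -1 | 7 |
| -1 | -1 | 1 | 0 | 1 | 8 |
| 1 | 1 | 1 | -1 | 0 | 9 |
| -1 | -1 | -1 | 1 | 0 | 10 |
| 1 | -1 | 1 | 1 | -1 | 11 |
| -1 | 1 | -1 | -1 | 1 | 12 |
| 0 | 0 | 0 | 0 | 0 | 13 |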

But for an even number of factors such as 4 factors it does return a 2K+1 9 run design, and for 6 factors it also returns a 2K+1 design of 13 runs.

I still have to read the JQT article in full to understand what is going on here; perhaps the extra runs are required to estimate certain higher-order effects? But for now it appears that the designs returned for an odd number of factors do not match what is shown on page 5 of your JQT article. What do you think?

Staff

The Add-In makes use of some new information about definitive screening designs that was published in the January 2012 issue of JQT by Xiao et al. These authors showed that by using a conference matrix and its foldover plus a center run, you could generate an orthogonal definitive screening design. The construction for a conference matrix is instantaneous, so you can build designs with a very large number of factors quickly. It seemed a good trade to add 2 extra runs to the designs with an odd number of factors in exchange for a design that is orthogonal for the main effects.
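As a quick check of the orthogonality claim, the 13-run, five-factor design posted in the comment above can be tested directly. This is only an illustrative sketch; the variable names are made up for the example, and the matrix is copied from the posted table.

```python
import numpy as np

# 13-run, 5-factor design as posted above (columns A-E, rows in standard order).
D = np.array([
    [ 0,  1,  1,  1,  1],
    [ 0, -1, -1, -1, -1],
    [ 1,  0, -1,  1,  1],
    [-1,  0,  1, -1, -1],
    [ 1, -1,  0, -1,  1],
    [-1,  1,  0,  1, -1],
    [ 1,  1, -1,  0, -1],
    [-1, -1,  1,  0,  1],
    [ 1,  1,  1, -1,  0],
    [-1, -1, -1,  1,  0],
    [ 1, -1,  1,  1, -1],
    [-1,  1, -1, -1,  1],
    [ 0,  0,  0,  0,  0],
])

# Main effects are orthogonal when the off-diagonal entries of D'D are all zero.
gram = D.T @ D
main_effects_orthogonal = np.array_equal(gram, np.diag(np.diag(gram)))

# Consecutive rows should mirror each other (foldover pairs).
is_foldover = all(np.array_equal(D[r], -D[r + 1]) for r in range(0, 12, 2))

# Number of factors at the middle level in each non-center run.
zeros_per_run = (D[:12] == 0).sum(axis=1)

print(gram)
print("orthogonal main effects:", main_effects_orthogonal)
print("foldover pairs:", is_foldover)
print("zeros per run:", zeros_per_run.tolist())
```

The last foldover pair has no factor at its middle level, which is what you would expect if the five columns were taken from a six-column conference matrix with one column left out.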

By the way, I would not use definitive screening designs for fewer than 5 factors. When you have 4 or fewer factors, you are really not in a screening situation. I should note that the 4 factor definitive screening design is a Graeco-Latin Square. At one point Stu Hunter wrote an article titled, "Let's all beware the Latin Square." In this article he points out that using these designs in industrial experiments is problematic because the second order effects, if they exist, are difficult to separate analytically.

I really like the definitive screening designs for 6 to 12 factors. If you have quantitative factors, they are hard to beat. For very large numbers of factors, I think these designs would be excellent for doing sensitivity analysis of computer codes.

Visitor

Dale Kopas wrote:

Thanks Brad. That all makes sense. I only saw the January 2011 JQT article. Will review the Xiao article as well. Regards, Dale.

Visitor

Jan M Pottinger wrote:

Thanks for the Blog. Extremely useful.

I have a seven-variable process that is very second order and somewhat third order. Interaction may be only strongly linear. Can I use a definitive model or should I be looking elsewhere?

Visitor

Definitive screening designs in chemistry wow at Informex 2013 wrote:

[...] more information on definitive screening designs, you may wish to read Bradley Jones' blog post introducing them or go right to Jones' co-authored and award-winning paper about [...]

Visitor

Kesav Reddy wrote:

Can blocking be used as a variable in these definitive screening designs?

Visitor

Kesav Reddy wrote:

especially now with the new categorical factor option in the JMP add-in

Staff

Chris Nachtsheim and I have been working on a research paper describing how to create orthogonally blocked definitive screening designs. The short answer is that you can create 2 to k orthogonal blocks, where k is the number of factors. You do this by ensuring that each block is composed of foldover pairs. To maintain the ability to fit all the quadratic effects, assuming fixed blocks, add a center run to each block.

Staff

As long as you view and implement DOE as a sequential process, the definitive screening design would be a reasonable place to start for your situation. If you really do have 3rd order effects, you will need to augment this design to resolve these.

Visitor

Kesav Reddy wrote:

thank you

Visitor

Kesav Reddy wrote:

I hope I understood it correctly (Blocks A and B)

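| A | B | C | D | E | Block | Run |
|---|---|---|---|---|---|---|
| 0 | 1 | 1 | 1 | 1 | A | 1 |
| 0 | -1 | -1 | -1 | -1 | A | 2 |
| 1 | 0 | -1 | 1 | 1 | A | 3 |
| -1 | 0 | 1 | -1 | -1 | A | 4 |
| 1 | -1 | 0 | -1 | 1 | A | 5 |
| -1 | 1 | 0 | 1 | -1 | A | 6 |
| 0 | 0 | 0 | 0 | 0 | A | 7 |
| 1 | 1 | -1 | 0 | -1 | B | 8 |
| -1 | -1 | 1 | 0 | 1 | B | 9 |
| 1 | 1 | 1 | -1 | 0 | B | 10 |
| -1 | -1 | -1 | 1 | 0 | B | 11 |
| 1 | -1 | 1 | 1 | -1 | B | 12 |
| -1 | 1 | -1 | -1 | 1 | B | 13 |
| 0 | 0 | 0 | 0 | 0 | B | 14 |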

Visitor

Al Annamalai wrote:

This is an excellent design with superior properties.

Though you have clearly articulated the purpose of the definitive screening design, some users want to use this design as a one-time experiment, similar to response surface modeling, because it allows estimating quadratic terms. Screening makes sense when there are a large number of factors. For a small number of factors, some want to use a definitive screening design in place of a response surface design. I would like to get your views on this. In this use, though the resulting model will not have two-factor interactions, the design's ability to estimate main and quadratic effects is desired. Can one use the resulting model, i.e., without two-factor interaction terms, and run some confirmatory experiments for some factor combinations, i.e., taking a chance on the interaction effects? Thank you.

Visitor

The experiment you entered above has 2 blocks labeled A and B. They are orthogonal to the main effects. So, yes, you did it correctly.

I should point out that though the blocks are orthogonal to the main effects of the factors, they are not orthogonal to the two-factor interactions.
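The blocking recipe described earlier (foldover pairs within each block, plus a center run per block) and the caveat about interactions can both be checked with a small sketch. It rebuilds the two-block design posted above from one run of each foldover pair; the helper `block` is a made-up name for this illustration.

```python
import numpy as np
from itertools import combinations

# One run from each foldover pair of the 5-factor design posted above.
half = np.array([
    [0,  1,  1,  1,  1],
    [1,  0, -1,  1,  1],
    [1, -1,  0, -1,  1],
    [1,  1, -1,  0, -1],
    [1,  1,  1, -1,  0],
    [1, -1,  1,  1, -1],
])

def block(rows):
    """Foldover pairs for the given half-rows, followed by one center run."""
    runs = [r for h in rows for r in (h, -h)]
    return np.vstack(runs + [np.zeros(half.shape[1], dtype=int)])

A, B = block(half[:3]), block(half[3:])  # blocks A and B, 7 runs each
D = np.vstack([A, B])
b = np.r_[np.ones(len(A), dtype=int), -np.ones(len(B), dtype=int)]  # +1 = A, -1 = B

# Inner products of the block contrast with main effects and with interactions.
main_vs_block = D.T @ b
inter_vs_block = {(i, j): int((D[:, i] * D[:, j]) @ b)
                  for i, j in combinations(range(5), 2)}

print("main effects vs. block:", main_vs_block.tolist())
print("interactions vs. block:", inter_vs_block)
```

Every main effect has a zero inner product with the block contrast, while some two-factor interactions (for instance the first two columns together) do not, which is the point made in the reply above.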

Visitor

Al,

I wrote in an earlier reply that I don't recommend using these designs for fewer than 5 factors. For 4 factors, you could use the 6 factor design and drop two columns. For 2 or 3 factors you could also use the 6 factor DSD and drop 4 or 3 columns respectively.

The resulting designs are around 90% efficient compared to the D-optimal design for the same number of runs.

Again, I do not advocate using DSDs for fewer than 5 factors. However, if you have 2 or 3 factors and you use the 6 factor design as described above, the result is a reasonable response surface design.
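One reading of "drop columns" is simply to generate the six-factor design and leave the unwanted columns unassigned. The sketch below assumes a six-factor design built from a Paley conference matrix of order 6, which may differ from the add-in's output in row order or signs. The function name `conference_6` and the choice of which columns to keep are hypothetical.

```python
import numpy as np

def conference_6():
    """A 6x6 symmetric conference matrix (Paley construction with q = 5)."""
    q = 5
    chi = {0: 0, 1: 1, 4: 1, 2: -1, 3: -1}  # quadratic character mod 5
    C = np.ones((q + 1, q + 1), dtype=int)
    C[0, 0] = 0
    C[1:, 1:] = [[chi[(i - j) % q] for j in range(q)] for i in range(q)]
    return C

C = conference_6()
full = np.vstack([C, -C, np.zeros((1, 6), dtype=int)])  # 13 runs, 6 columns

# Hypothetical choice: assign the four real factors to the first four columns
# and ignore the remaining two.
keep = [0, 1, 2, 3]
X4 = full[:, keep]

gram = X4.T @ X4
print(X4)
print(gram)
```

The run count stays at 13 and the retained main-effect columns remain mutually orthogonal, since dropping columns does not change the inner products of the columns that are kept.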

Visitor

guido.desmarets wrote:

Bradley, I recently discovered your new invention. This DSD looks like the Holy Grail of screening designs. Congratulations on the work you have been doing.

I still have a question about the Heredity principle. I have come across a few situations where this principle is not valid, being visualized by nicely-crossed-line-interactions, and then we can miss the important interactions in our model as the main factors are not significant. Is there anything which can be done on that?

regards

Guido

Staff

You can include all the two-factor interactions and quadratic effects as potential terms in your model. Then, you can do stepwise regression starting with all the main effects in the model. A big two-factor interaction (crossed lines) will be obvious even if the associated factors do not have large main effects.

Visitor

Greg Steeno wrote:

Just took your and Chris' class at FTC. It was excellent. I have a much better appreciation of DSDs after that.

Quick question. Do DSDs make sense for (linearly) constrained experimental regions? I assume not, as some desired orthogonality goes out the door due to the inherent factor constraints.

- Greg

Staff

Greg,

I am glad you liked the course.

To answer your question, since there is a zero in every row of the DSD, you automatically cut off the most extreme points in the number of factors you are considering. For instance, suppose you have 6 factors coded -1 to 1. Then constraints of the form X1 +/- X2 +/- X3 +/- X4 +/- X5 +/- X6 <= 5 work.
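The reasoning behind the bound of 5 can be written out, assuming the six-factor design in which every run other than the center run has exactly one factor at 0 and the other five at ±1. For any choice of signs s_i:

$$
\sum_{i=1}^{6} s_i x_i \;\le\; \sum_{i=1}^{6} |x_i| \;=\; 5, \qquad s_i \in \{-1, +1\},
$$

and the center run gives 0. So every run of the design satisfies every constraint of this form, and no points are removed.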

Linear constraints involving a subset of the factors will remove points from the DSD and affect the statistical properties.

Level I

I really like the definitive screening design in JMP and I ran into a similar concern as discussed in this post. It's my understanding that a five-factor design is still in a grey area, and that is exactly my situation. I tried a five-factor design and a six-factor design with one additional column; both gave me the same 13 runs and basically the same design, except for the additional column that's supposed to be dropped. Do they also work the same in the design evaluation after the experiment results are gathered?

Thank you for your response on how to properly use DSD with 4 factors, but the part I don't understand is how to drop the two columns. Do you mean simply ignore them? Maybe my questions about 4 or 5 factors design are caused by the misunderstanding about DSD. I would really appreciate it if you can give me some help on this.

Thanks,

Fan