Jan 16, 2017 10:08 AM

A surprising connection between Sudoku and design of experiments

Sudoku is a very popular puzzle and pastime. What may come as a surprise is that a completed Sudoku can be viewed as a kind of designed experiment. The picture below is of a JMP data table showing a completed Sudoku.

A Sudoku has a very interesting structure. It is a 9-by-9 grid of numbers from 1 to 9 where every row and column has all nine numbers. A Latin Square design has these same elements.

I have colored nine smaller 3-by-3 cells, each of which also contains all nine numbers. So, a Sudoku is a very special case of a Latin Square design. Most Latin Square designs do not have this extra feature.
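As a minimal sketch of the two properties described above, the following Python checks whether a grid is a Latin Square and whether its nine 3-by-3 cells are also complete. The grid built by `make_sudoku` is an illustrative shifted pattern, not the Sudoku shown in the data table, and all function names are hypothetical.

```python
def make_sudoku():
    """Build one valid completed Sudoku from a shifting pattern (illustration only)."""
    return [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]


def is_latin_square(grid):
    """True if every row and every column contains each symbol 1..n exactly once."""
    n = len(grid)
    symbols = set(range(1, n + 1))
    rows_ok = all(set(row) == symbols for row in grid)
    cols_ok = all({grid[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


def boxes_complete(grid):
    """True if each of the nine 3-by-3 cells contains all nine numbers."""
    symbols = set(range(1, 10))
    return all(
        {grid[br + i][bc + j] for i in range(3) for j in range(3)} == symbols
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    )


sudoku = make_sudoku()
# A plain cyclic Latin Square: each row is the previous row shifted by one.
cyclic = [[(r + c) % 9 + 1 for c in range(9)] for r in range(9)]

print(is_latin_square(sudoku), boxes_complete(sudoku))  # True True
print(is_latin_square(cyclic), boxes_complete(cyclic))  # True False
```

The cyclic square passes the row and column check but fails the 3-by-3 check, which is the sense in which a Sudoku is a special case of a Latin Square.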

Ronald Fisher pioneered the use of these designs for agricultural experiments in the 1920s and 1930s. Imagine dividing a large square field into a 9-by-9 grid of smaller plots. Suppose that you wanted to test nine different treatments on some crop to see which one resulted in the best yield. A Latin Square design allows the experimenter to independently estimate the row effect, the column effect, and the treatment effect. For this scenario, the Latin Square design is the most efficient design possible in the sense of minimizing the variance of all these estimates.
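To make the idea of independent estimation concrete, here is a small simulation sketch in Python. It assumes a purely additive, noise-free model with made-up effect sizes and a cyclic Latin Square layout. None of these numbers come from a real experiment. Because every treatment appears exactly once in each row and each column, the row and column effects average out of every treatment mean.

```python
import random

random.seed(0)
n = 9
# Cyclic Latin Square: treatment labels 0..8, one per plot.
design = [[(r + c) % n for c in range(n)] for r in range(n)]


def centered(values):
    """Shift values so they sum to zero (the usual effect constraint)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Hypothetical effects: a strong fertility gradient, modest treatment effects.
row_eff = centered([random.gauss(0, 5) for _ in range(n)])
col_eff = centered([random.gauss(0, 5) for _ in range(n)])
trt_eff = centered([random.gauss(0, 1) for _ in range(n)])

# Noise-free additive yields (an assumption made to keep the point visible).
yields = [
    [100 + row_eff[r] + col_eff[c] + trt_eff[design[r][c]] for c in range(n)]
    for r in range(n)
]
grand_mean = sum(map(sum, yields)) / n**2

# Estimate each treatment effect as (treatment mean - grand mean).
trt_est = []
for t in range(n):
    plots = [yields[r][c] for r in range(n) for c in range(n) if design[r][c] == t]
    trt_est.append(sum(plots) / len(plots) - grand_mean)

max_error = max(abs(est - true) for est, true in zip(trt_est, trt_eff))
print(max_error < 1e-9)  # True
```

Even though the simulated row and column effects are much larger than the treatment effects, the simple treatment means recover the treatment effects up to floating-point rounding.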

You can imagine that there might be some fertility gradient across the field. The row and column effects can adjust the treatment effect estimate for any fertility gradient effect. However, suppose the northeast corner of the large field (colored light green above) had some unique damage so that the yield in that whole 3-by-3 grid was zero. Losing that 3-by-3 square would still leave eight squares, each employing all nine treatments. In a standard Latin Square design, the result of losing this square generally would be some lack of balance in the number of replications of each treatment.
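The effect of losing the northeast 3-by-3 square can be checked by counting replications. The sketch below uses an illustrative Sudoku pattern and a cyclic Latin Square as the "standard" comparison. Both grids are assumptions chosen for the demonstration, not the designs pictured in this post.

```python
from collections import Counter

# Illustrative grids: a valid Sudoku pattern and a cyclic Latin Square.
sudoku = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
cyclic = [[(r + c) % 9 + 1 for c in range(9)] for r in range(9)]


def counts_without_box(grid, rows=range(0, 3), cols=range(6, 9)):
    """Count replications of each treatment after dropping one 3-by-3 block.

    The defaults (top three rows, right three columns) stand in for the
    northeast corner of the field.
    """
    return Counter(
        grid[r][c]
        for r in range(9)
        for c in range(9)
        if not (r in rows and c in cols)
    )


sudoku_counts = counts_without_box(sudoku)
cyclic_counts = counts_without_box(cyclic)

print(sorted(sudoku_counts.values()))  # [8, 8, 8, 8, 8, 8, 8, 8, 8]
print(sorted(cyclic_counts.values()))  # [6, 7, 7, 8, 8, 9, 9, 9, 9]
```

The Sudoku keeps eight replications of every treatment, while the cyclic Latin Square is left with between six and nine replications per treatment.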

More importantly, an assumption behind using a Latin Square design is that the row and column factors only have additive effects (i.e., they do not have a two-factor interaction). The Sudoku design can provide a kind of test of this assumption. You could do this by comparing each cell’s predicted value with the average of the predicted values of the row and column entries making up the cell.

Sadly, no. The cell effects are confounded with the row and column effects if we treat all these effects as *fixed* effects. However, if we treat the row and column effects as *random* effects, we can estimate the variance of these effects as well as the fixed cell and treatment effects. We call a design with two crossed random effects like this a *strip-plot* design.
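One way to see the confounding under fixed effects is to look at the rank of the model matrix. The Python sketch below builds indicator columns for an intercept, the nine rows, the nine columns and, optionally, the nine cells, then computes the rank with exact fraction arithmetic. The helper names are hypothetical, and this is only a linear-algebra illustration, not how JMP fits the model.

```python
from fractions import Fraction


def rank(matrix):
    """Matrix rank by Gaussian elimination over exact fractions."""
    m = [[Fraction(x) for x in row] for row in matrix]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column: it depends on earlier columns
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            if m[i][col] != 0:
                factor = m[i][col] / m[rk][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk


def indicator(index, size):
    """One-hot coding of a factor level."""
    return [1 if k == index else 0 for k in range(size)]


def model_matrix(include_cells):
    """One model row per plot: intercept, row, column and (optionally) cell indicators."""
    out = []
    for r in range(9):
        for c in range(9):
            x = [1] + indicator(r, 9) + indicator(c, 9)
            if include_cells:
                cell = 3 * (r // 3) + c // 3  # which 3-by-3 cell the plot is in
                x += indicator(cell, 9)
            out.append(x)
    return out


rank_rows_cols = rank(model_matrix(include_cells=False))
rank_with_cells = rank(model_matrix(include_cells=True))
print(rank_rows_cols, rank_with_cells)  # 17 21
```

Separately estimable fixed effects for the intercept, rows, columns, and cells would require rank 1 + 8 + 8 + 8 = 25. The rank of 21 shows that part of the cell information lies in the space already spanned by the row and column effects.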

It depends…

If the goal is to minimize the variances of the treatment effects, while maintaining the meaning of a cell in terms of its location, then the answer is, yes.

However, if you wanted to estimate both the cell effects and the treatment effects with maximum efficiency, there is a better design. I constructed a D-optimal strip-plot design using a Covering Array design – a feature that was new in JMP 12.

The figure below shows the standard errors of the estimates for the Sudoku design relative to those for the D-optimal design, for both the cell effects (Cell 1 through Cell 8) and the treatment effects (designated N 1 through N 8). The comparison uses the new Compare Designs tool on the DOE menu in JMP 13.

You can see that the standard errors of the cell effect estimates for the Sudoku design are at least three times larger than for the D-optimal design.

The D-optimal design accomplishes this by removing the interpretation of a cell in terms of its location as a group of rows and columns. In agricultural experiments, this would be undesirable.

In an industrial setting with two crossed random blocking factors, the D-optimal design could be very useful.
