Testing software preferences with a covering array
Jun 5, 2015 8:51 AM
In any computer software, it’s not unusual to have a set of preferences that lets users customize settings. JMP is no exception. For example, in JMP 12, the Categorical platform allows a great deal of the output to be tailored to the user’s preferences. Here’s what we see under Preferences for the Categorical platform:
There are a number of check boxes, drop-down options and editable number boxes. Imagine trying to test whether certain combinations of these settings would cause a failure in the software. Even if we restrict the editable number boxes to just a few values each, we still have two options with four choices (levels), four options with three choices, and 39 binary options. If we were to test every possible combination, we would have a whopping 4^2 * 3^4 * 2^39 = 712,483,534,798,848 combinations. Obviously, testing every possible combination is not feasible.
However, it turns out that faults in software are usually due to the interaction of just a few options. For example, if I wanted to test only two check boxes, I would have four cases to consider: both boxes checked, two cases with one box checked and one not, and both boxes unchecked. Following this idea, why not first consider all pairs of options? Even if I just wanted to check all of the two-option possibilities, a little math (multiplying the numbers of levels for each of the 990 pairs of options and adding up the results) shows that I have 4,690 pairwise combinations to consider.
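For anyone who wants to check the "little math," here is a short Python sketch that reproduces both counts. The level counts come straight from the description above; the variable names are just my own choices for illustration.

```python
from itertools import combinations
from math import prod

# Level counts described in the post:
# 2 options with four levels, 4 options with three levels, 39 binary options.
levels = [4] * 2 + [3] * 4 + [2] * 39

# Full factorial: one test for every combination of every option.
total_combinations = prod(levels)

# Pairwise: for each pair of options, every combination of their levels.
pairwise_combinations = sum(a * b for a, b in combinations(levels, 2))

print(f"{len(levels)} options")                                 # 45 options
print(f"{total_combinations:,} full combinations")             # 712,483,534,798,848
print(f"{pairwise_combinations:,} pairwise combinations")      # 4,690
```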
A different way to test?
Luckily, in JMP 12 we have covering arrays! They can help us derive test scenarios for testing software. With a covering array, if I specify strength 2, I know that all of the two-option combinations will be covered. I use DOE->Covering Array (JMP Pro only), and add factors corresponding to each option. Part of my factors setup is shown below, but I’ve included the Factors table on the File Exchange. You can load this under the Covering Array platform by having the factors table open and selecting Load Factors under the red triangle menu.
Keeping Strength = 2 means that I want to ensure that every combination of levels for every pair of options appears somewhere in the design. When I chose Continue and then Make Design, I ended up with a 20-run design (your results may vary, but usually the run size will be in the range of 19-21 runs). It’s astounding to think that with 45 options, all combinations of all possible pairs appear in just 20 runs.
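To get a feel for how so few runs can cover every pair, here is a minimal greedy sketch in Python. To be clear, this is not the algorithm JMP uses (the post does not describe it), and the function name `greedy_pairwise` is hypothetical. It simply builds one run at a time, always picking levels that cover as many still-uncovered pairs as possible, and makes no attempt to reach the smallest possible design.

```python
from itertools import combinations

def greedy_pairwise(levels):
    """Build a strength-2 covering array with a simple greedy heuristic.

    levels[i] is the number of levels of option i; levels are coded 0..n-1.
    Illustrative sketch only, not JMP's method.
    """
    k = len(levels)
    # Every (option i, level a, option j, level b) with i < j that still needs a run.
    uncovered = {
        (i, a, j, b)
        for i, j in combinations(range(k), 2)
        for a in range(levels[i])
        for b in range(levels[j])
    }
    runs = []
    while uncovered:
        # Seed the run with one uncovered pair so every run makes progress.
        i, a, j, b = min(uncovered)
        run = {i: a, j: b}
        for f in range(k):
            if f in run:
                continue

            def gain(level):
                # Number of uncovered pairs this level would cover together
                # with the options already fixed in the current run.
                return sum(
                    ((g, run[g], f, level) if g < f else (f, level, g, run[g]))
                    in uncovered
                    for g in run
                )

            run[f] = max(range(levels[f]), key=gain)
        row = [run[f] for f in range(k)]
        runs.append(row)
        # Mark every pair that appears in this run as covered.
        for p, q in combinations(range(k), 2):
            uncovered.discard((p, row[p], q, row[q]))
    return runs

# Same option structure as in the post.
levels = [4] * 2 + [3] * 4 + [2] * 39
design = greedy_pairwise(levels)
print(f"{len(design)} runs cover all pairs of {len(levels)} options")
```

The run count from a heuristic like this depends on the order in which pairs and levels are considered, so it will not necessarily match the 19-21 runs mentioned above.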
Can I do it in fewer runs?
Since the two factors with the largest number of values have four levels each, their 4 x 4 = 16 level combinations must each appear in a separate run, so the lower bound on the size of this strength 2 covering array is 16 runs. For this example, having four extra runs doesn’t seem like a big deal, but there are cases where a 20 percent savings in the number of runs is substantial in terms of time and money. It’s easy to try to find a smaller design using the Optimize button. Selecting 10,000 iterations, I was able to find the 16-run design on the File Exchange.
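The lower-bound argument is easy to sketch in Python: a strength t design needs at least as many runs as there are level combinations among the t options with the most levels. The helper name `strength_lower_bound` is my own, and the bound is only a floor. It says nothing about whether a design of that size actually exists.

```python
from math import prod

def strength_lower_bound(levels, t=2):
    """Smallest run count any strength-t covering array could have:
    the product of the t largest level counts."""
    largest = sorted(levels, reverse=True)[:t]
    return prod(largest)

# Same option structure as in the post.
levels = [4] * 2 + [3] * 4 + [2] * 39

bound = strength_lower_bound(levels)   # 4 * 4
savings = (20 - bound) / 20            # going from the 20-run design to the bound

print(bound, f"{savings:.0%}")  # 16 20%
```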
But wait, there’s more!
Let’s take a closer look at the Metrics from the Covering Array platform:
The coverage numbers represent the percentage of combinations involving t factors that are covered by the design (i.e., how many of the possible combinations appear somewhere together in the design). If the Categorical platform passes the set of tests defined by the covering array, we’ve covered all combinations from any pair of options, hence the 100 percent coverage for t=2. But, if everything passes, then we’ve also tested (and passed) over 85 percent of the combinations from three options, and 57 percent of those from four options.
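To make the coverage idea concrete, here is a small Python sketch that computes the fraction of t-option combinations appearing in a design, following the definition above. It is my own illustration rather than JMP's implementation, and the tiny 4-run design for three check boxes is a made-up example, not the design from this post.

```python
from itertools import combinations
from math import prod

def coverage(design, levels, t):
    """Fraction of all t-option level combinations that appear in at least one run.

    design is a list of runs; each run lists one level (coded 0..n-1) per option.
    """
    covered = 0
    possible = 0
    for cols in combinations(range(len(levels)), t):
        # Distinct level combinations this set of options takes across the runs.
        seen = {tuple(run[c] for c in cols) for run in design}
        covered += len(seen)
        possible += prod(levels[c] for c in cols)
    return covered / possible

# Hypothetical example: three binary options tested in four runs.
toy_levels = [2, 2, 2]
toy_design = [
    [0, 0, 0],
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]

print(coverage(toy_design, toy_levels, 2))  # 1.0 (every pair of settings appears)
print(coverage(toy_design, toy_levels, 3))  # 0.5 (4 of the 8 triples appear)
```

Even in this toy case, the four runs that cover every pair also pick up half of the three-option combinations for free, which is the same effect the Metrics report shows for t=3 and t=4.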
This seems too good to be true
With all those options, do 16 runs really give me all that information? Keep in mind that the design only guarantees that all those combinations appear somewhere in the runs. If everything passes, we know that there are no faults due to any two options together, and none due to the many three- and four-option combinations that the design happens to cover.
What happens if I do see a failure? Then the covering array has helped us identify that there’s a problem, but not the direct cause of that problem. Can it help us find the cause of the failure? It turns out that we do have a lot of information contained in the covering array based on not just the failure, but also the successful tests. Next time, we’ll look at how it can help us to find a cause.