Optimal Design of the Choice Experiment

My previous blog post covered issues in the design of a choice experiment for laptop computers. The goal was to model the trade-offs among features and price. In this post, I'll show how to design a choice experiment.

The Choice Design feature, which you access from the DOE menu in JMP 8, designs choice experiments. This platform was developed by Bradley Jones, with help from Chris Gotwalt.

The first job is to enter the factors in the experiment. After adding the factors and specifying the levels, the window looks like this:

Next, we specify the model. This has to be a small experiment, so we just take the default main-effect model.

Next, we fill in the Prior Specification. Remember from my previous post that the optimal design depends on what the answer is, and we don’t know the answer. Actually, we already know a lot about the choices. We already know that people want large disks, higher speeds, longer battery life and lower price.

The experiment measures the relative strengths of these characteristics; that is, it measures trade-offs, particularly the trade-off between price and features. The response, utility, is oriented in the positive direction, so more utility is better. Notice that all the factor levels are ordered so that the least desirable levels come first and the most desirable levels come last.

Now we can tell the designer that we know the direction of these levels. We do this by entering a prior mean. We say that an 80 Gig (GB) disk is worth 1 utility unit more than a 40 Gig disk, that a 2.0 GHz CPU is worth 1 utility unit more than a 1.5 GHz CPU, and so on. Of course, we don’t really know the magnitudes of these effects; that uncertainty is expressed in the Prior Variance Matrix, with 1s on the diagonal. The convention is that if the first level is less desirable, you enter a negative value, as we do here. When there are three levels in increasing utility order, as with price, enter a negative value and then 0. Actually, it doesn’t matter whether we enter the levels in the right order for the parameterization as long as the ordering is consistent across levels.

This Prior Specification is important in experiments like this, where the factors all have known preference directions and the goal is to measure trade-offs. If we didn’t specify this, then we could easily get choice-set items where one choice included all of the better factor levels and the other choice included all of the worse factor levels; in such a case, the choice response would be trivially obvious, and the run would be wasted.
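To make the "wasted run" point concrete, here is a minimal Python sketch. It assumes a simple additive utility with made-up part-worths of 1 utility unit per step (the `PART_WORTHS` values and the function names are illustrative assumptions, not JMP's internal parameterization) and a standard logit choice probability. It compares an all-good versus all-bad pair with the first choice set from the design listed later in this post.

```python
import math

# Hypothetical part-worths in utility units, least desirable level first.
# These numbers are assumptions for illustration, not JMP output.
PART_WORTHS = {
    "disk": {"40 Gig": 0.0, "80 Gig": 1.0},
    "speed": {"1.5 GHz": 0.0, "2.0 GHz": 1.0},
    "battery": {"4 hours": 0.0, "6 hours": 1.0},
    "price": {"$1,500": 0.0, "$1,200": 1.0, "$1,000": 2.0},
}
FACTORS = ("disk", "speed", "battery", "price")


def utility(profile):
    """Additive utility of a profile given as a tuple of level labels."""
    return sum(PART_WORTHS[f][level] for f, level in zip(FACTORS, profile))


def prob_first_chosen(a, b):
    """Logit probability that profile a is chosen over profile b."""
    return 1.0 / (1.0 + math.exp(utility(b) - utility(a)))


# A pair where one profile wins on every factor.
all_good = ("80 Gig", "2.0 GHz", "6 hours", "$1,000")
all_bad = ("40 Gig", "1.5 GHz", "4 hours", "$1,500")
p_dominated = prob_first_chosen(all_good, all_bad)

# Choice set 1 from the design: a genuine trade-off.
set1_a = ("40 Gig", "1.5 GHz", "6 hours", "$1,500")
set1_b = ("40 Gig", "2.0 GHz", "4 hours", "$1,200")
p_tradeoff = prob_first_chosen(set1_a, set1_b)

# p * (1 - p) is the weight a binary logit response carries in the
# information about the parameters; near zero means the answer is
# close to a foregone conclusion.
for label, p in (("all-good vs all-bad", p_dominated),
                 ("choice set 1", p_tradeoff)):
    print(f"{label}: P(first) = {p:.3f}, p(1-p) = {p * (1 - p):.4f}")
```

Under these assumed part-worths, the dominated pair is chosen almost deterministically, so its response tells us very little, while the trade-off pair leaves real uncertainty for the respondent to resolve.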

Now we specify the rest of the experiment we want:

Suppose we have 16 subjects lined up to take the choice survey. We figure that each subject has the patience to do six comparisons. Each choice set will have two profiles; we could ask people to choose among more, but that is more work for the subject, and two is standard. We choose to do two surveys, each taken by eight respondents. This is a compromise between giving everyone the same questions and giving everyone his or her own separate survey with separately designed choice sets. The total number of subjects is the product of the last two specifications (2*8=16). The total number of choice responses is the product of the last three specifications (6*2*8=96).

Now there are two levels of design data here. There are the profiles that go into making each choice set. There are two profiles per choice set times six choice sets per survey times two surveys, making a table of 24 (2*6*2) choice profile rows.
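The bookkeeping is easy to check with a few lines of Python. The variable names here are my own labels for the design settings, not the names of fields in the JMP dialog.

```python
# Design settings as described in the text; names are illustrative.
profiles_per_choice_set = 2
choice_sets_per_survey = 6
surveys = 2
respondents_per_survey = 8

# Subjects: surveys times respondents per survey.
subjects = surveys * respondents_per_survey

# Responses: one per choice set, per subject.
responses = choice_sets_per_survey * surveys * respondents_per_survey

# Rows in the profile table: one per profile in every choice set.
profile_rows = profiles_per_choice_set * choice_sets_per_survey * surveys

print(subjects, responses, profile_rows)  # 16 96 24
```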

This structures the factor-level data so that you can prepare the raw material for the survey.

Then there is the subject-level data for the responses, showing which subjects get which survey and having a slot to enter the response for each choice trial. Here are the rows for the first two subjects. The first subject is taking Survey 1, and the second subject is taking Survey 2.

The Choice1 and Choice2 values index the Choice ID value in the Profiles table that matches the Choice Set ID. For example, in row 10, Choice1 is Choice ID 1 for Choice Set 10 in Survey 2, which is Row 19 in the Profiles table (80 Gig, 1.5 GHz, 4 hours, \$1,000), where the other choice is the next profile in Row 20 (40 Gig, 2.0 GHz, 4 hours, \$1,500).
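The lookup from a response row to a profile row can be sketched as a one-line formula. This assumes the Profiles table is sorted by Choice Set ID and then Choice ID, with choice sets numbered consecutively across both surveys, as in the example above; `profile_row` is a hypothetical helper, not a JMP function.

```python
PROFILES_PER_CHOICE_SET = 2


def profile_row(choice_set_id, choice_id,
                profiles_per_set=PROFILES_PER_CHOICE_SET):
    """1-based row in the Profiles table for a given choice set and choice ID.

    Assumes rows are sorted by choice set, then by choice ID.
    """
    return (choice_set_id - 1) * profiles_per_set + choice_id


print(profile_row(10, 1))  # 19
print(profile_row(10, 2))  # 20
```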

Why have two tables instead of one? It turns out that you have a choice of one table or two.

Let’s see whether this design follows the guidelines. Every choice must be a trade-off of desirable alternatives:

| Survey | Choice Set | Choice ID | hard disk | speed | battery life | price |
|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 40 Gig | 1.5 GHz | 6 hours | \$1,500 |
| 1 | 1 | 2 | 40 Gig | 2.0 GHz | 4 hours | \$1,200 |

This tests whether you are willing to pay \$300 more to get two more hours of battery life even if you also have to sacrifice speed. Trade-off of \$300 and speed for battery life.

| Survey | Choice Set | Choice ID | hard disk | speed | battery life | price |
|---|---|---|---|---|---|---|
| 1 | 2 | 1 | 40 Gig | 1.5 GHz | 6 hours | \$1,000 |
| 1 | 2 | 2 | 80 Gig | 1.5 GHz | 4 hours | \$1,500 |

Trade-off of \$500 and battery life against hard disk.

| Survey | Choice Set | Choice ID | hard disk | speed | battery life | price |
|---|---|---|---|---|---|---|
| 1 | 3 | 1 | 80 Gig | 1.5 GHz | 6 hours | \$1,200 |
| 1 | 3 | 2 | 40 Gig | 2.0 GHz | 4 hours | \$1,000 |

Trade-off of \$200 and speed for disk and battery life.

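| Survey | Choice Set | Choice ID | hard disk | speed | battery life | price |
|---|---|---|---|---|---|---|
| 1 | 4 | 1 | 80 Gig | 1.5 GHz | 4 hours | \$1,500 |
| 1 | 4 | 2 | 40 Gig | 1.5 GHz | 4 hours | \$1,200 |

Trade-off of \$300 for disk.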

| Survey | Choice Set | Choice ID | hard disk | speed | battery life | price |
|---|---|---|---|---|---|---|
| 1 | 5 | 1 | 80 Gig | 2.0 GHz | 4 hours | \$1,200 |
| 1 | 5 | 2 | 40 Gig | 1.5 GHz | 6 hours | \$1,000 |

Trade-off of \$200 and battery for speed and disk.

| Survey | Choice Set | Choice ID | hard disk | speed | battery life | price |
|---|---|---|---|---|---|---|
| 1 | 6 | 1 | 80 Gig | 2.0 GHz | 4 hours | \$1,500 |
| 1 | 6 | 2 | 80 Gig | 1.5 GHz | 6 hours | \$1,200 |

Trade-off of \$300 and battery for speed.

| Survey | Choice Set | Choice ID | hard disk | speed | battery life | price |
|---|---|---|---|---|---|---|
| 2 | 7 | 1 | 80 Gig | 2.0 GHz | 6 hours | \$1,200 |
| 2 | 7 | 2 | 40 Gig | 1.5 GHz | 4 hours | \$1,000 |

Trade-off of \$200 for disk, speed and battery life.

| Survey | Choice Set | Choice ID | hard disk | speed | battery life | price |
|---|---|---|---|---|---|---|
| 2 | 8 | 1 | 40 Gig | 2.0 GHz | 6 hours | \$1,200 |
| 2 | 8 | 2 | 80 Gig | 1.5 GHz | 4 hours | \$1,000 |

Trade-off of \$200 and disk for speed and battery.

| Survey | Choice Set | Choice ID | hard disk | speed | battery life | price |
|---|---|---|---|---|---|---|
| 2 | 9 | 1 | 40 Gig | 1.5 GHz | 6 hours | \$1,500 |
| 2 | 9 | 2 | 40 Gig | 2.0 GHz | 4 hours | \$1,000 |

Trade-off of \$500 and speed for battery.

| Survey | Choice Set | Choice ID | hard disk | speed | battery life | price |
|---|---|---|---|---|---|---|
| 2 | 10 | 1 | 80 Gig | 1.5 GHz | 4 hours | \$1,000 |
| 2 | 10 | 2 | 40 Gig | 2.0 GHz | 4 hours | \$1,500 |

Trade-off of \$500 and disk for speed.

| Survey | Choice Set | Choice ID | hard disk | speed | battery life | price |
|---|---|---|---|---|---|---|
| 2 | 11 | 1 | 80 Gig | 1.5 GHz | 4 hours | \$1,200 |
| 2 | 11 | 2 | 40 Gig | 2.0 GHz | 6 hours | \$1,500 |

Trade-off of \$300 and disk for speed and battery.

| Survey | Choice Set | Choice ID | hard disk | speed | battery life | price |
|---|---|---|---|---|---|---|
| 2 | 12 | 1 | 40 Gig | 1.5 GHz | 4 hours | \$1,000 |
| 2 | 12 | 2 | 80 Gig | 2.0 GHz | 4 hours | \$1,500 |

Trade-off of \$500 for disk and speed.

Are there any degenerate choices (i.e., where the choices are equal)? No. That's good.

For each factor, do we have choices where that factor is constant (so that a dominant factor can’t prevent the other factors from being measured)? Well, no. Price is always different in each choice set, so if price is totally dominant, we can’t measure other effects. If this is a concern, then we need to go back to the Design Generation field and change 4 to 3 in “Number of attributes that can change within a choice set.”
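Both the degenerate-choice check and the constant-factor check can be run mechanically. The following Python sketch transcribes the 12 choice sets listed above into a dictionary (the `DESIGN` structure and variable names are my own, not something JMP produces) and reports, for each factor, the choice sets in which that factor is held constant.

```python
FACTORS = ("disk", "speed", "battery", "price")

# The 12 choice sets listed above: choice set ID -> (profile 1, profile 2).
# Levels are stored as numbers: Gig, GHz, hours, dollars.
DESIGN = {
    1: ((40, 1.5, 6, 1500), (40, 2.0, 4, 1200)),
    2: ((40, 1.5, 6, 1000), (80, 1.5, 4, 1500)),
    3: ((80, 1.5, 6, 1200), (40, 2.0, 4, 1000)),
    4: ((80, 1.5, 4, 1500), (40, 1.5, 4, 1200)),
    5: ((80, 2.0, 4, 1200), (40, 1.5, 6, 1000)),
    6: ((80, 2.0, 4, 1500), (80, 1.5, 6, 1200)),
    7: ((80, 2.0, 6, 1200), (40, 1.5, 4, 1000)),
    8: ((40, 2.0, 6, 1200), (80, 1.5, 4, 1000)),
    9: ((40, 1.5, 6, 1500), (40, 2.0, 4, 1000)),
    10: ((80, 1.5, 4, 1000), (40, 2.0, 4, 1500)),
    11: ((80, 1.5, 4, 1200), (40, 2.0, 6, 1500)),
    12: ((40, 1.5, 4, 1000), (80, 2.0, 4, 1500)),
}

# Degenerate choice sets: both profiles identical.
degenerate = [cs for cs, (a, b) in DESIGN.items() if a == b]

# For each factor, the choice sets where both profiles share its level.
constant_in = {
    factor: [cs for cs, (a, b) in DESIGN.items() if a[i] == b[i]]
    for i, factor in enumerate(FACTORS)
}

print("degenerate:", degenerate)
for factor, sets in constant_in.items():
    print(f"{factor} constant in choice sets: {sets}")
```

The empty list for price is the gap described above: every other factor is held constant in at least one choice set, but price never is.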

How about the polarity question? Polar factors should always have a mixture of polarity. That means the trade-offs should always be meaningful, not just all-good versus all-bad. This is where the Prior Specification works well. All of the choices are working pretty hard to measure values of interest. No choice is uninteresting.
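One way to check the polarity guideline is to look for dominated choice sets, where one profile is at least as good on every factor and strictly better on at least one. The sketch below does this for the 12 choice sets listed above, assuming the preference directions stated earlier (bigger disk, higher speed, longer battery life, lower price). The `DESIGN` dictionary and the `dominates` helper are illustrative names of my own.

```python
# The 12 choice sets listed above: choice set ID -> (profile 1, profile 2).
# Each profile is (disk in Gig, speed in GHz, battery in hours, price in dollars).
DESIGN = {
    1: ((40, 1.5, 6, 1500), (40, 2.0, 4, 1200)),
    2: ((40, 1.5, 6, 1000), (80, 1.5, 4, 1500)),
    3: ((80, 1.5, 6, 1200), (40, 2.0, 4, 1000)),
    4: ((80, 1.5, 4, 1500), (40, 1.5, 4, 1200)),
    5: ((80, 2.0, 4, 1200), (40, 1.5, 6, 1000)),
    6: ((80, 2.0, 4, 1500), (80, 1.5, 6, 1200)),
    7: ((80, 2.0, 6, 1200), (40, 1.5, 4, 1000)),
    8: ((40, 2.0, 6, 1200), (80, 1.5, 4, 1000)),
    9: ((40, 1.5, 6, 1500), (40, 2.0, 4, 1000)),
    10: ((80, 1.5, 4, 1000), (40, 2.0, 4, 1500)),
    11: ((80, 1.5, 4, 1200), (40, 2.0, 6, 1500)),
    12: ((40, 1.5, 4, 1000), (80, 2.0, 4, 1500)),
}

# +1: a higher value is more desirable; -1: a lower value is (price).
DIRECTION = (+1, +1, +1, -1)


def dominates(a, b):
    """True if a is at least as good as b on every factor and better on one."""
    diffs = [d * (x - y) for d, x, y in zip(DIRECTION, a, b)]
    return all(v >= 0 for v in diffs) and any(v > 0 for v in diffs)


dominated_sets = [
    cs for cs, (a, b) in DESIGN.items()
    if dominates(a, b) or dominates(b, a)
]
print("dominated choice sets:", dominated_sets)
```

An empty result means every choice set asks the respondent to give something up in order to gain something else.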

Now we have an experimental design. Thanks to Brad Jones for this example.

Visitor

inge Liekens wrote:

How can you bring in an opt-out scenario, e.g. a "neither of the two choices" option, where the attributes are always fixed?

Visitor

Charles wrote: