
## Testing - Sample sizes for a full factorial

Hi,

I’d really like some help with sample sizes for a full factorial design for a marketing campaign I’m involved in, where the response of interest is response rate. I want to test 4 factors; each has 2 levels, cannot be changed and is categorical. If I were to conduct 4 separate A/B tests, the cell volumes would look like this:

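Control response rate = 0.6%

| Test | Success | alpha | Power | Sides | Volume/cell | Total volume |
| --- | --- | --- | --- | --- | --- | --- |
| Test 1 (A/B) | 50% uplift | 0.8 | 0.2 | 1 | 4687 | 9374 |
| Test 2 (A/B) | 10% uplift | 0.8 | 0.2 | 2 | 156606 | 313212 |
| Test 3 (A/B) | 10% uplift | 0.8 | 0.2 | 2 | 156606 | 313212 |
| Test 4 (A/B) | 10% uplift | 0.8 | 0.2 | 2 | 156606 | 313212 |
| Total | | | | | | 949010 |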

But I’m keen to look at the interaction effects between the factors (only at the 2-factor level). I would like to set up a 16-cell full factorial DOE but am unsure what sample size to use in each of the 16 cells. Any help would be greatly appreciated.

Thanks, LP


## Re: Testing - Sample sizes for a full factorial

Hi Liz,

The procedure for determining power estimates in JMP Pro for a DOE with a binomial response as in your case has a number of steps as follows:

• Design your experiment in the normal way, then select Simulate Responses from the DOE hot spot and specify the coefficients for your simulated model. For example, I used coefficients that explore effects equal to the size of the random noise, 1.5 times the size of the random noise, 2 times the size of the random noise and so on (see the attached screenshot).
• Click Make Table, which gives one simulated scenario based on 100 trials (i.e. asking 100 people) in the example provided.
• Run a generalised regression model with a binomial response, including main effects and 2-factor interaction effects in the model (see the scripts in the attached table called "power analysis with binomial response").
• In the GenReg report, right-click the p-value column and select simulation.
• In the resulting simulation table run the power analysis script and examine the lower 95% confidence interval under simulated power for each term in your model. This gives you the minimum anticipated power for each effect in your model for various alpha rates (type I error or significance level). An example is attached in the table called Generalized Regression Simulate Results (Prob > ChiSquare) n trials = 500
• You may need to run the power scenario with several different numbers of trials for each row in your design matrix. This is easily done by copying and editing the column formula in the first table. For illustration purposes, I’ve given scenarios for n=100, n=250 and n=500 trials (or people surveyed) per row in the DOE matrix.
• Using the power for each of these scenarios you can make a decision about the number of trials (people surveyed) per row in your experimental design to give acceptable power and type I error trade-off.
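For anyone who wants to see the idea behind these steps outside JMP, here is a minimal Python sketch of power-by-simulation for a 2^4 factorial with a binomial response. It is not the JMP procedure: instead of a generalised regression it uses a simple pooled two-proportion z-test on one main effect, and the function name `simulate_power`, the coefficients (0.2 in log odds), the intercept of 0 and the cell sizes are all hypothetical values chosen only for illustration.

```python
import itertools
import math
import random
from statistics import NormalDist


def expit(x):
    """Convert log odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_power(n_per_cell, coefs, intercept, factor=0,
                   alpha=0.05, n_sims=300, seed=1):
    """Estimate power to detect one main effect in a 2^k full factorial.

    Assumptions (not from the thread): main effects only, -1/+1 coding,
    and a pooled two-proportion z-test in place of a fitted model.
    """
    rng = random.Random(seed)
    design = list(itertools.product((-1, 1), repeat=len(coefs)))  # 16 rows for k=4
    # True response probability for each row of the design
    probs = [expit(intercept + sum(c * x for c, x in zip(coefs, row)))
             for row in design]
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided test
    n_half = n_per_cell * len(design) // 2          # people at each level of the factor
    hits = 0
    for _ in range(n_sims):
        successes = {-1: 0, 1: 0}
        for row, p in zip(design, probs):
            # Simulate n_per_cell yes/no responses for this cell
            successes[row[factor]] += sum(rng.random() < p
                                          for _ in range(n_per_cell))
        p_lo = successes[-1] / n_half
        p_hi = successes[1] / n_half
        pooled = (successes[-1] + successes[1]) / (2 * n_half)
        se = math.sqrt(2 * pooled * (1 - pooled) / n_half)
        if se > 0 and abs(p_hi - p_lo) / se > z_crit:
            hits += 1
    return hits / n_sims   # share of simulations where the effect was detected


if __name__ == "__main__":
    for n in (25, 100):
        power = simulate_power(n, coefs=[0.2, 0.0, 0.0, 0.0], intercept=0.0)
        print(f"n per cell = {n}: estimated power = {power:.2f}")
```

As in the steps above, you would rerun this for several cell sizes and pick the smallest one that gives acceptable power. To study interaction terms you would need to fit a model that includes them, which this sketch deliberately leaves out.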

Best,

Malcolm

## Re: Testing - Sample sizes for a full factorial

Hi Malcolm,

Firstly, thank you so much for your response. I spent some time with my colleague yesterday going through your example and we are pleased with our learnings.

One part I think we need to understand a little better is around the simulated responses for the effects (as in the "example simulation" jpeg you have attached).

I think that, as we have some prior knowledge of the expected response rate and uplift, we can use these as conditions to simulate Y?

Base expected response rate = 0.6%

Success for factor 1 = 50% uplift

Success for factor 2, 3, and 4 = 10% uplift

I'm thinking the intercept would be ln(0.006/0.994) = -5.11, but I'm not sure about X1 - X4 and the interaction effects.

Any further help would be great

Thanks Liz

## Re: Testing - Sample sizes for a full factorial

The first thing to say is that all of these estimates are in log odds, which are a bit strange.

So the baseline response (or “intercept”) of 0.6% is a probability, p, of 0.006. As you said, you can convert to log odds like this:

Log odds = ln( p / (1 - p) ) = ln(0.006 / 0.994) = -5.11

Next, an uplift of 10% means that the response rate is 10% higher for level 1 versus level 2 (or vice versa). This is approximately the same as saying that each level sits 5% above or below the average response for both levels of the factor.

So a response rate of 0.63% for level 1 versus 0.57% for level 2.

Which means p = 0.0063 versus p = 0.0057.

Which means log odds for level 1 = ln( 0.0063 / (1 - 0.0063) ) = -5.06

Finally, the difference in log odds for level 1 versus the intercept, using unrounded values, is -5.0609 - (-5.1100) = 0.0491
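The same arithmetic can be checked with a few lines of Python. This is only a sketch of the hand calculation above; the helper name `effect_from_uplift` is made up for illustration, and it assumes, as above, that an uplift of u is split into +u/2 and -u/2 around the baseline rate.

```python
import math


def logit(p):
    """Log odds of a probability p."""
    return math.log(p / (1 - p))


def effect_from_uplift(base_rate, uplift):
    """Log-odds effect of level 1 versus the baseline (intercept).

    Assumes level 1 sits uplift/2 above the baseline rate,
    following the convention used in the reply above.
    """
    p_level1 = base_rate * (1 + uplift / 2)
    return logit(p_level1) - logit(base_rate)


base = 0.006                        # 0.6% control response rate
intercept = logit(base)
effect_10 = effect_from_uplift(base, 0.10)
effect_50 = effect_from_uplift(base, 0.50)

print(f"intercept      = {intercept:.2f}")   # -5.11
print(f"10% uplift     = {effect_10:.4f}")   # 0.0491
print(f"50% uplift     = {effect_50:.4f}")
```

The last line applies the same convention to the 50% uplift expected for factor 1, which is my extension of the method rather than something stated in the reply.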

It is a bit confusing because of our definition of the intercept. We define the effect for level 1 as the change in response rate (in log odds) versus the baseline response rate, which is the response rate averaged across level 1 and level 2.

In other software the intercept is often defined as the response at one (reference) level, and the effect as the difference between the other level and that reference level.
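To make the two conventions concrete, here is a small Python sketch that expresses the same pair of response rates (0.63% and 0.57%) both ways. The function names are hypothetical, and the "effect coding" version is my assumption about the convention being described: it takes the intercept as the mean of the two log odds, which is close to, but not identical to, the log odds of the averaged rate used in the hand calculation, so the numbers differ slightly.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def expit(x):
    return 1 / (1 + math.exp(-x))


def effect_coding(p_level1, p_level2):
    """Intercept = mean of the two log odds; effect = level 1 minus that mean."""
    l1, l2 = logit(p_level1), logit(p_level2)
    intercept = (l1 + l2) / 2
    return intercept, l1 - intercept


def reference_coding(p_level1, p_level2):
    """Intercept = log odds of level 2 (the reference); effect = level 1 minus level 2."""
    l1, l2 = logit(p_level1), logit(p_level2)
    return l2, l1 - l2


p1, p2 = 0.0063, 0.0057
b0_e, b1_e = effect_coding(p1, p2)
b0_r, b1_r = reference_coding(p1, p2)

print(f"effect coding:    intercept = {b0_e:.4f}, effect = {b1_e:.4f}")
print(f"reference coding: intercept = {b0_r:.4f}, effect = {b1_r:.4f}")
```

Both parameterisations describe the same two response rates; the reference-coded effect is simply twice the effect-coded one, so it matters which convention a tool uses when you type in coefficients for a simulation.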

## Re: Testing - Sample sizes for a full factorial

Hi Phil,

Thanks so much for this - I feel like I'm nearly there. For the 1 level factors and a cell size of 15,000 (I have just found out that this is the maximum I can have from a budget perspective) I get a power of 0.72-0.76. I was hoping for >0.8, but I think I can get comfortable given this is a marketing campaign (and not a manufacturing process).

The last part is the interaction effects. I've just put in 0.0491 for the estimates of these as well (and get even lower power), but I don't think this is correct.

Any help would be great

Thanks again for everything so far

Liz