Liz_Perkin
Community Trekker

Testing - Sample sizes for a full factorial

 

Hi,

I’d really like some help with sample sizes for a full factorial design for a marketing campaign I’m involved in, where the response is response rate. I want to test 4 factors; each is categorical with 2 levels that cannot be changed. If I were to conduct 4 separate A/B tests, the cell volumes would look like this:

 

Control response rate = 0.6%

| Test | Success | alpha | Power | Sides | Volume/cell | Total volume |
|---|---|---|---|---|---|---|
| Test 1 (A/B) | 50% uplift | 0.8 | 0.2 | 1 | 4687 | 9374 |
| Test 2 (A/B) | 10% uplift | 0.8 | 0.2 | 2 | 156606 | 313212 |
| Test 3 (A/B) | 10% uplift | 0.8 | 0.2 | 2 | 156606 | 313212 |
| Test 4 (A/B) | 10% uplift | 0.8 | 0.2 | 2 | 156606 | 313212 |
| Total: | | | | | | 949010 |
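For anyone wanting to check the A/B cell volumes, here is a minimal Python sketch of one standard two-proportion sample size formula (pooled variance under the null, unpooled under the alternative). It is not necessarily the method used to produce the figures above, which is not stated. It assumes a significance level of 0.2 and a power of 0.8, the reading of the 0.8 and 0.2 values that reproduces the volumes; the function name `n_per_cell` is made up for illustration.

```python
import math
from statistics import NormalDist


def n_per_cell(p_control, uplift, alpha, power, sides):
    """Sample size per cell for comparing two proportions (normal approximation)."""
    p_test = p_control * (1 + uplift)          # e.g. 0.6% with 50% uplift -> 0.9%
    z_alpha = NormalDist().inv_cdf(1 - alpha / sides)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_test) / 2
    sd_null = math.sqrt(2 * p_bar * (1 - p_bar))                       # pooled
    sd_alt = math.sqrt(p_control * (1 - p_control) + p_test * (1 - p_test))
    n = ((z_alpha * sd_null + z_beta * sd_alt) / (p_test - p_control)) ** 2
    return math.ceil(n)


# Assumed reading: alpha = 0.2, power = 0.8.
test_1 = n_per_cell(0.006, 0.50, alpha=0.2, power=0.8, sides=1)
test_2 = n_per_cell(0.006, 0.10, alpha=0.2, power=0.8, sides=2)
print(test_1, test_2)
```

Under these assumptions the sketch gives 4687 per cell for Test 1 and a figure within a fraction of a percent of 156606 for Tests 2 to 4; the small gap for the 10% tests suggests the original calculation used a slightly different approximation.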

 

 

But I’m keen to look at the interaction effects between the factors (two-factor interactions only). I would like to set up a 16-cell full factorial DOE but am unsure what sample size to use in each of the 16 cells. Any help would be greatly appreciated.

Let me know if any more information is required

Thanks, LP

4 REPLIES

Re: Testing - Sample sizes for a full factorial

Hi Liz,

 

Estimating power in JMP Pro for a DOE with a binomial response, as in your case, takes several steps:

 

  • Design your experiment in the usual way, then select Simulate Responses from the DOE hot spot and specify the coefficients for your simulated model. For example, I used coefficients that explore effects equal to the size of the random noise, 1.5 times the random noise, 2 times the random noise and so on (see the attached screenshot).
  • Click Make Table. This gives one simulated scenario, based on 100 trials per row (that is, asking 100 people) in the example provided.
  • Run a generalised regression model with a binomial response, including main effects and 2-factor interaction effects in the model (see the scripts in the attached table called power analysis with binomila response).
  • In the GenReg report, right-click the p-value column and select Simulate.
  • In the resulting simulation table, run the power analysis script and examine the lower 95% confidence limit of the simulated power for each term in your model. This gives the minimum anticipated power for each effect at various alpha levels (type I error rate, or significance level). An example is attached in the table called Generalized Regression Simulate Results (Prob > ChiSquare) n trials = 500.
  • You may need to run the power scenario with several different numbers of trials per row of your design matrix. This is easily done by copying and editing the column formula in the first table. For illustration, I’ve given scenarios for n=100, n=250 and n=500 trials (or people surveyed) per row in the DOE matrix.
  • Using the power from each of these scenarios, you can decide how many trials (people surveyed) per row of your experimental design give an acceptable trade-off between power and type I error.
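For readers without JMP Pro, the idea behind the steps above (simulate responses from assumed coefficients, test each effect, count how often it comes out significant) can be sketched in plain Python. This is only a rough analogue, not the GenReg procedure: it swaps in a pooled two-proportion z-test for the fitted model, covers main effects only, and the function name and example coefficients are hypothetical.

```python
import itertools
import math
import random
from statistics import NormalDist


def simulate_main_effect_power(intercept, coefs, n_per_row, alpha=0.05,
                               n_sims=200, seed=1):
    """Estimate power for each main effect of a 2^k full factorial by simulation.

    intercept and coefs are in log odds; factors are effect coded as -1/+1.
    """
    rng = random.Random(seed)
    k = len(coefs)
    design = list(itertools.product((-1, 1), repeat=k))  # all 2^k rows
    # True success probability for each design row under the logistic model.
    probs = [1 / (1 + math.exp(-(intercept + sum(b * x for b, x in zip(coefs, row)))))
             for row in design]
    n_level = n_per_row * len(design) // 2  # trials at each level of a factor
    rejections = [0] * k
    for _ in range(n_sims):
        # One simulated experiment: successes out of n_per_row for every row.
        successes = [sum(rng.random() < p for _ in range(n_per_row)) for p in probs]
        total = sum(successes)
        pooled = total / (2 * n_level)
        se = math.sqrt(2 * pooled * (1 - pooled) / n_level)
        if se == 0:
            continue  # all failures or all successes: nothing can be rejected
        for j in range(k):
            high = sum(s for s, row in zip(successes, design) if row[j] == 1)
            low = total - high
            z = (high - low) / n_level / se
            p_value = 2 * (1 - NormalDist().cdf(abs(z)))
            if p_value < alpha:
                rejections[j] += 1
    return [r / n_sims for r in rejections]


# Hypothetical coefficients: one strong effect, one weak effect, two null effects.
for n in (100, 250):
    print(n, simulate_main_effect_power(0.0, [0.5, 0.1, 0.0, 0.0], n))
```

Rerunning with different values of `n_per_row` mirrors the last two steps: the simulated power for each effect can be compared across sample sizes to find an acceptable trade-off.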

 

Best,

 

Malcolm

Liz_Perkin
Community Trekker

Re: Testing - Sample sizes for a full factorial

Hi Malcolm,

 

Firstly, thank you so much for your response. I spent some time with my colleague yesterday going through your example, and we are pleased with what we learned.

One part I think we need to understand a little better is the simulated responses for the effects (as in the example simulation JPEG you attached).

Since we have some prior knowledge of the expected response rate and uplift, I think we can use these as conditions to simulate Y?

Base expected response rate = 0.6%

Success for factor 1 = 50% uplift

Success for factor 2, 3, and 4 = 10% uplift

 

I'm thinking the intercept should be ln(0.006/0.994) = -5.11, but I'm not sure about X1 to X4 or the interaction effects.

Any further help would be great.

Thanks, Liz

phil_kay
Staff

Re: Testing - Sample sizes for a full factorial

The first thing to say is that all of these estimates are in log odds, which can feel a bit strange.

 

So the baseline response (or “intercept”) of 0.6% is a probability, p, of 0.006. As you said, you can convert to log odds like this:

 

Log odds = ln(p / (1 - p)) = ln(0.006 / 0.994) = -5.11

 

 

Next, an uplift of 10% means that the response rate is 10% higher for level 1 versus level 2 (or vice versa). This is approximately the same as saying that level 1 is 5% above, and level 2 is 5% below, the average response across both levels of the factor.

 

So a response rate of 0.63% for level 1 versus 0.57% for level 2.

 

Which means p = 0.0063 versus p = 0.0057.

 

Which means log odds for level 1 = ln(0.0063 / (1 - 0.0063)) = -5.06

 

Finally, the difference in log odds for level 1 versus the intercept is -5.06 - (-5.11) = 0.0491 (calculated from the unrounded values, -5.0609 and -5.1100).
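The log odds arithmetic above can be checked with a few lines of Python. This is just a sketch of the same calculation using the 0.6% baseline and the 0.63% level 1 rate from this post; the helper name `log_odds` is made up for illustration.

```python
import math


def log_odds(p):
    """Convert a probability to log odds."""
    return math.log(p / (1 - p))


baseline = log_odds(0.006)    # overall response rate of 0.6%
level_1 = log_odds(0.0063)    # level 1 response rate of 0.63%
effect = level_1 - baseline   # coefficient for level 1 versus the baseline

print(round(baseline, 2), round(level_1, 2), round(effect, 4))
# prints: -5.11 -5.06 0.0491
```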

 

 

It is a bit confusing because of our definition of the intercept. We define the effect for level 1 as the change in response rate (in log odds) versus the baseline response rate, which is the response rate averaged across level 1 and level 2.

 

Other software often uses a different coding, in which the intercept is the response for one (reference) level and the effect is the difference between the other level and that reference.
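To make the two conventions concrete, here is a small Python sketch that expresses the same pair of response rates (0.63% and 0.57%) under both codings. The -1/+1 and 0/1 codings shown are the usual textbook forms and are assumed here rather than taken from any particular package; all variable names are illustrative.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def inv_logit(x):
    return 1 / (1 + math.exp(-x))


p_level_1, p_level_2 = 0.0063, 0.0057

# Effect coding (level 1 = +1, level 2 = -1): intercept is the average log odds.
intercept_eff = (logit(p_level_1) + logit(p_level_2)) / 2
coef_eff = (logit(p_level_1) - logit(p_level_2)) / 2

# Reference coding (level 1 = 1, level 2 = 0): intercept is the level 2 log odds.
intercept_ref = logit(p_level_2)
coef_ref = logit(p_level_1) - logit(p_level_2)

# Both parameterisations give back the same response rate for level 1.
print(inv_logit(intercept_eff + coef_eff), inv_logit(intercept_ref + coef_ref))
print(round(coef_eff, 4), round(coef_ref, 4))
```

The reference-coded coefficient is exactly twice the effect-coded one. The effect-coded value here comes out near 0.050 rather than 0.0491 because it is measured from the mean of the two log odds rather than from ln(0.006 / 0.994); the two baselines are close but not identical.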

Liz_Perkin
Community Trekker

Re: Testing - Sample sizes for a full factorial

Hi Phil,

 

Thanks so much for this - I feel like I'm nearly there. For the main effects (first-order terms) and a cell size of 15,000 (I have just found out that this is the maximum I can have from a budget perspective), I get a power of 0.72-0.76. I was hoping for >0.8, but I think I can get comfortable with that given this is a marketing campaign (and not a manufacturing process).

The last part is the interaction effects. I've just put in 0.0491 for the estimates of these as well (and get even lower power), but I don't think this is correct.

Any help would be great

 

Thanks again for everything so far

 

Liz
