I’d really like some help with sample sizes for a full factorial design for a Marketing campaign I’m involved in - looking at response rate. I want to test 4 things, all have 2 levels, cannot be changed and are categorical. If I were to conduct 4 separate A/B tests the cell volumes would look like this:
Control response rate = 0.6%
Test 1 (A/B)
Test 2 (A/B)
Test 3 (A/B)
Test 4 (A/B)
But I’m keen to look at the interaction effects (two-factor interactions only) between the factors. I would like to set up a 16-cell full factorial DOE but am unsure what sample size to use in each of the 16 cells. Any help would be greatly appreciated.
Let me know if any more information is required
The procedure for determining power estimates in JMP Pro for a DOE with a binomial response as in your case has a number of steps as follows:
Firstly, thank you so much for your response. I spent some time with my colleague yesterday going through your example and we are pleased with what we learned.
One part I think we need to understand a little better is the simulated responses for the effects (as in the "example similation" jpeg you attached).
Since we have some prior knowledge of the expected response rate and uplift, I think we can use these as conditions to simulate Y?
Base expected response rate = 0.6%
Success for factor 1 = 50% uplift
Success for factor 2, 3, and 4 = 10% uplift
I'm thinking the intercept should be ln(0.006/0.994) = -5.11, but I'm not sure about X1 - X4 and the interaction effects.
Any further help would be great
The first thing to say is that all of these estimates are in log odds, which are a bit strange.
So the baseline response (or “intercept”) of 0.6% is a probability, p, of 0.006. As you said, you can convert to log odds like this:
Log odds = ln( p / (1 - p) ) = ln( 0.006 / 0.994 ) = -5.11
Next, an uplift of 10% means that the response rate is 10% higher for level 1 versus level 2 (or vice versa). This is the same as saying that it is an uplift of 5% versus the average response for both levels of the factor.
So a response rate of 0.63% for level 1 versus 0.57% for level 2.
Which means p = 0.0063 versus p = 0.0057.
Which means log odds for level 1 = ln( 0.0063 / (1 - 0.0063) ) = -5.06
Finally the difference in log odds for level 1 versus the intercept is -5.06 - (-5.11) = 0.0491 (calculated from the unrounded values).
It is a bit confusing because of our definition of the intercept. We define the effect for level 1 as the change in response rate (in log odds) versus the baseline response rate, which is the response rate averaged across level 1 and level 2.
In other software the effect is often defined as the response for one level versus the other level, with the intercept being the response at the reference level.
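The conversion steps above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic in this reply, not JMP output. The function names are made up for illustration, and it assumes the 50% uplift for factor 1 is split evenly around the baseline in the same way as the 10% uplift.

```python
import math

def logit(p):
    """Convert a probability to log odds."""
    return math.log(p / (1.0 - p))

def effect_from_uplift(base_rate, uplift):
    """Log-odds estimate for a two-level factor, following the steps above.

    The uplift is split evenly around the baseline (a 10% uplift gives
    base_rate * 1.05 for the better level). The estimate is that level's
    log odds minus the baseline log odds.
    """
    p_high = base_rate * (1.0 + uplift / 2.0)
    return logit(p_high) - logit(base_rate)

base = 0.006
intercept = logit(base)
effect_10 = effect_from_uplift(base, 0.10)  # factors 2, 3 and 4
effect_50 = effect_from_uplift(base, 0.50)  # factor 1 (assumed same split)

print(f"intercept         = {intercept:.4f}")
print(f"10% uplift effect = {effect_10:.4f}")
print(f"50% uplift effect = {effect_50:.4f}")
```

The 10% case reproduces the 0.0491 worked out by hand above; the 50% case follows the same recipe with 0.75% as the response rate for the better level.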
Thanks so much for this - I feel like I'm nearly there. For the 1 level factors and a cell size of 15,000 (I have just found out that this is the maximum I can have from a budget perspective) I get a power of 0.72-0.76. I was hoping for >0.8, but I think I can get comfortable with it given this is a marketing campaign (and not a manufacturing process).
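As a rough cross-check on how power moves with cell size, here is a back-of-envelope sketch using a normal (Wald) approximation rather than simulation. Everything in it is an assumption for illustration: a balanced 16-cell design, -1/+1 effect coding, a response rate close to 0.6% in every cell, and a two-sided test at alpha = 0.05. The function name and the larger cell sizes are hypothetical. Because the test and significance level behind the simulated figures are not stated here, the numbers need not agree with them.

```python
import math
from statistics import NormalDist

def approx_power(estimate, cell_size, n_cells=16, base_rate=0.006, alpha=0.05):
    """Approximate power to detect one effect-coded log-odds estimate.

    With orthogonal -1/+1 columns and a nearly constant response rate p,
    the standard error of each estimate is roughly 1 / sqrt(N * p * (1 - p)),
    where N is the total sample size across all cells.
    """
    n_total = cell_size * n_cells
    se = 1.0 / math.sqrt(n_total * base_rate * (1.0 - base_rate))
    z = abs(estimate) / se
    nd = NormalDist()
    z_crit = nd.inv_cdf(1.0 - alpha / 2.0)
    # Two-sided rejection probability under the alternative
    return nd.cdf(z - z_crit) + nd.cdf(-z - z_crit)

# Cell sizes other than 15,000 are arbitrary values for comparison
for cell_size in (15_000, 30_000, 60_000):
    print(cell_size, round(approx_power(0.0491, cell_size), 3))
```

In this approximation the power depends on the total sample across all 16 cells, which is why a factorial layout uses the sample more efficiently than four separate A/B tests of the same overall size.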
The last part is the interaction effects. I've just put in 0.0491 for the estimates of these also (and get even lower power) - but I don't think this is correct.
Any help would be great
Thanks again for everything so far
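One way to judge whether 0.0491 is a sensible interaction estimate is to translate it back into response rates per cell. The sketch below does that for two of the factors. It assumes interaction terms are coded as the product of the -1/+1 main-effect columns, which is the usual convention for effect coding but is not confirmed in this thread; the function names are made up for illustration.

```python
import math
from itertools import product

def inv_logit(x):
    """Convert log odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def cell_rates(intercept, b1, b2, b12):
    """Response rate in each of the four cells of a two-factor model.

    Assumes -1/+1 effect coding, with the interaction column equal to the
    product of the two main-effect columns.
    """
    rates = {}
    for x1, x2 in product((+1, -1), repeat=2):
        log_odds = intercept + b1 * x1 + b2 * x2 + b12 * x1 * x2
        rates[(x1, x2)] = inv_logit(log_odds)
    return rates

intercept = math.log(0.006 / 0.994)
no_interaction = cell_rates(intercept, 0.0491, 0.0491, 0.0)
with_interaction = cell_rates(intercept, 0.0491, 0.0491, 0.0491)

for label, rates in (("b12 = 0", no_interaction), ("b12 = 0.0491", with_interaction)):
    print(label)
    for cell, rate in rates.items():
        print(f"  x1={cell[0]:+d}, x2={cell[1]:+d}: {rate:.4%}")
```

Under these assumptions, an interaction estimate equal to the main effects means the first factor's uplift disappears entirely when the second factor is at its -1 level and doubles (in log odds) when it is at +1. If the interactions expected in practice are weaker than that, a smaller estimate would be the more realistic input, and the power to detect it would be lower still.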