
Stats Grouping Question

kmohajer

Community Trekker

Joined:

Jan 6, 2015

I am working with a 2x2 factorial design, and the four groups have 10, 10, 18, and 36 observations. Statistically, would I get better results by cutting each group down to 10 observations (so the sizes are all equal), or should I work with all of the data, even though one group has far more than the rest?

1 REPLY
alexw

Community Trekker

Joined:

Apr 25, 2014

Use the whole dataset. If you cut the bigger groups down to 10 observations, you would need to decide how to subsample them, and however you do it, you discard information and make the results depend on which observations happened to be kept. While equal group sizes are preferable because the results are easier to interpret, there is nothing wrong per se with unequal group sizes. Just bear in mind that the fitted and predicted values from your factorial model will almost certainly have wider confidence/prediction intervals for the groups with n = 10 than for the groups with n = 18 or 36. In other words, your estimates will be more precise in some cells than in others.
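
To put rough numbers on this, here is a minimal Python sketch. It assumes every cell shares the same within-cell standard deviation `sigma` (set to 1 purely for illustration) and that the model includes the interaction, so each fitted value is just the cell mean. The function names are made up for this example and are not from JMP or any library.

```python
import math

# Cell sizes from the question (2x2 factorial).
cell_sizes = [10, 10, 18, 36]

def cell_mean_se(n, sigma=1.0):
    """Standard error of a cell mean, sigma / sqrt(n).

    Assumes a common within-cell SD (sigma) across all cells.
    """
    return sigma / math.sqrt(n)

def interaction_contrast_se(sizes, sigma=1.0):
    """SE of the interaction contrast m11 - m12 - m21 + m22.

    The cell means are independent, so their variances (sigma^2 / n) add.
    """
    return sigma * math.sqrt(sum(1.0 / n for n in sizes))

# Precision of each cell mean: smaller cells have larger standard errors.
for n in cell_sizes:
    print(f"n = {n:2d}: SE of cell mean = {cell_mean_se(n):.3f} * sigma")

# Compare keeping all data with trimming every cell to 10 observations.
full = interaction_contrast_se(cell_sizes)
trimmed = interaction_contrast_se([10, 10, 10, 10])
print(f"interaction SE, all data:      {full:.3f} * sigma")
print(f"interaction SE, trimmed to 10: {trimmed:.3f} * sigma")
```

Under these assumptions, the cells with n = 10 have noticeably larger standard errors than the cell with n = 36, and trimming every cell to 10 makes the interaction estimate less precise, not more.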