Hi, I'm getting ready to begin my master's program and want to be as prepared as possible in designing, implementing, and analyzing my experiment(s). I'll be researching the biological degradation of a carcinogenic substance known as aflatoxin B1. Due to the nature of my experiment, I have one hard-to-change factor and will be working with a split-plot design. I have generated numerous designs using different numbers of runs, whole plots, and levels of each factor, with the goal of maximizing power and minimizing variance. I have seen in the literature, and was given guidance, that each of my factors should have a power > 0.8 in a biological experiment, and I have been evaluating the quality of my designs partly on that basis. Anyhow, I have generated a feasible design where the power for my whole plot factor is 0.923 and the power for all other terms, up to second-order interactions, is 1.
My concern lies with the VIF = 12.37 of the whole plot factor (all other VIFs, for subplot factors and second-order interactions, are between 1 and 1.72). In other designs that I've created, the VIF for the whole plot factor has been between 15 and 45. From the literature that I've encountered (which is limited by all accounts), the highest acceptable VIF is 10, which corresponds to a tolerance (1 - R^2) of 0.1. Does this expectation hold true in split-plot designs, or is it understood that the VIF will be higher? Is this compensated for by using GLS rather than OLS, and is that automatically detected in JMP when fitting Y by X? Any help is appreciated!
Wood in Substrate
Inoculated w/ P. ostreatus
In a split plot, under a null of no differences in means, you would have perfect collinearity between whole plot means. Personally, I don't think examination of the VIF for whole plots in a split plot design is meaningful.
Thanks, Steve. Do you know of any literature that would support this idea, or is it assumed to be common knowledge that I'm not making sense of because of my lack of experience? You can probably understand that I want to have a definitive defense in case this comes into question by my graduate committee. Are you saying that there is always perfect collinearity between whole plot means, or only in my particular case? I'm working hard to gain a better understanding of these concepts and am frustrated with myself that I don't have mastery of them. I'm reading books and articles (Goos, Jones, Montgomery, etc.) and watching statistics and biostatistics lectures incessantly. I've only had the opportunity to get one basic biostats class under my belt, where the most advanced concept was one-way ANOVA and we didn't even separate the means. Excuses, excuses, but I apologize for not completely getting this yet.
No problem, and I want to be clearer than I was. First, I just looked at the formula for calculating VIF and noticed that there is an automatic dependence if the null is true (the subplot means would always perfectly predict the whole plot means). Thus, I made the leap of faith (not always a good idea, especially with graduate committees) that large VIFs would be expected due to the design.
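For reference, the usual textbook definition is VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors. The sketch below is a minimal illustration for the two-predictor case only, where R^2 is just the squared correlation; the function names and the example columns are made up for illustration, and this is not necessarily how JMP computes the VIF it reports for a split-plot design.

```python
import math

def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the model has exactly two of them.

    With two predictors, R_j^2 equals the squared correlation between them,
    so VIF = 1 / (1 - r^2).
    """
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical orthogonal 2x2 factorial columns: no correlation.
vif_orthogonal = vif_two_predictors([-1, -1, 1, 1], [-1, 1, -1, 1])

# Hypothetical correlated columns: the VIF rises above 1.
vif_correlated = vif_two_predictors([1, 2, 3, 4], [1, 2, 4, 3])

# The common "VIF <= 10" rule of thumb corresponds to R^2 = 0.9,
# i.e. a tolerance (1 - R^2) of 0.1.
vif_at_r2_090 = 1.0 / (1.0 - 0.9)
```

An orthogonal pair gives a VIF of 1, and the value grows without bound as R^2 approaches 1, which is the "infinity" limit for a perfectly dependent term.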
Now on to another question (again, just like qualifying exams). Are the whole plots, the 16 chambers, all at the same temperature, or are there 8 at each of 2 temperatures, or 4 at each of 4? The reason I ask is that if they are all set to the same temperature, you should probably treat the whole plot as a random effect, and then the VIF is kind of meaningless.
They are at different temperatures. The way the experiment was designed, I have 6 incubators at 20 °C, 3 at 24 °C, and 7 at 28 °C. It seems like it would make more sense to have 6 at 20 °C, 4 at 24 °C, and 6 at 28 °C, but that isn't how JMP generated it. Anyhow, enzymes operate more or less efficiently at different temperatures, and we want to know if this will alter the capacity to degrade the substance in question. If they were all the same, I would have gone for the full factorial design and would have been able to drastically reduce the number of runs needed.
The 6:3:7 is more efficient than the 6:4:6 arrangement, as it is closer to the 2:1:2 of a center point design. Anyway, this would explain why the VIF for the whole plot is on the order of 10 to 50, rather than infinity (which would be my expectation if there were no whole plot treatment levels). Subplot means go a long way toward explaining the variability in whole plot means in your data.
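A separate, textbook way to see why whole-plot terms tend to be estimated less precisely than subplot terms is to compare coefficient variances directly. The sketch below is only an illustration under strong simplifying assumptions that are not from this thread: a balanced two-level whole-plot factor coded -1/+1, the same number of subplot runs in every whole plot, and known variance components. The 16 whole plots match the chambers mentioned above; the 4 runs per chamber and the variance values are hypothetical. It is not a reconstruction of JMP's VIF calculation.

```python
def whole_plot_effect_variance(n_whole_plots, runs_per_plot, var_wp, var_sub):
    """Variance of the whole-plot factor coefficient in a balanced split plot.

    Each whole-plot mean has variance var_wp + var_sub / runs_per_plot, and
    the -1/+1 coded coefficient averages over n_whole_plots such means.
    """
    return (var_wp + var_sub / runs_per_plot) / n_whole_plots

def crd_effect_variance(n_runs, var_sub):
    """Variance of the same coefficient if all runs were independently reset
    (completely randomized design, no whole-plot error)."""
    return var_sub / n_runs

n_whole_plots = 16   # number of chambers, from the thread
runs_per_plot = 4    # hypothetical
var_sub = 1.0        # hypothetical subplot (residual) variance
var_wp = 1.0         # hypothetical whole-plot variance

v_split = whole_plot_effect_variance(n_whole_plots, runs_per_plot, var_wp, var_sub)
v_crd = crd_effect_variance(n_whole_plots * runs_per_plot, var_sub)

# Under these assumptions the ratio simplifies to
# 1 + runs_per_plot * (var_wp / var_sub).
inflation = v_split / v_crd
```

Under these assumptions the inflation depends only on the number of runs per whole plot and the ratio of the two variance components, so more runs inside each chamber do little for the whole-plot factor; adding chambers is what helps.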