So I want to run a two factor analysis of variance with post-hoc test of my data (time and treatment). I have a data set for time zero (before treatment is applied - treatment has two levels and time has three).
How do I fairly assign this data set to both treatments? It is representative of both at the start, since no time has passed with treatment being administered. It seems like you would just put the same data in twice and assign it to each treatment individually, but using the same data twice seems unfair.
I am running a cell culture experiment. I am looking at the effects of oxygen tension (low vs high), cell seeding density (low vs high), and time in culture (0, 14, and 28 days) on DNA and collagen quantity.
So I started the experiment on day 0. I collected data (n=4) for low and high cell seeding density; this is two data sets (one for each seeding condition). Now as the experiment ran I collected data for each oxygen and seeding condition (4 groups per timepoint).
Okay, how do I use this initial time zero data (2 groups) with the subsequent data (4 groups: oxygen tension (low and high) by seeding density (low and high))?
Is it statistically correct to put the same data in the model twice as day 0 (once for low oxygen and once for high oxygen, for a given seeding density)?
I didn't run a day 0 for each oxygen condition. The day 0 data for a given seeding density pertains to both oxygen levels, because the experiment is just starting and the cells have spent no experimental time in either condition.
I specifically asked what your "subjects" (or experimental units) were in this study. You didn't answer. You say that at day zero, you collected N=4 somethings. N=4 what? What were you collecting data from? Was it 4 different petri dishes of cells?
Then you go on to say "I didn't run a day 0 for each oxygen condition as the data for a given seeding density pertains to each oxygen level as the cells have spent no experimental time in any condition as experiment is just starting." I can't understand this without knowing what the "subjects" were. And how each "subject" moved through time and experimental conditions. Of the 4 somethings on day zero, did you split them up so that on day 14, two of them got low oxygen tension and two of them got high oxygen tension? Please give us the details here.
At the beginning of the experiment I took cells and encapsulated them in alginate beads at a density of 1 or 3x10^6 cells/ml. Now these are the two starting conditions.
After this some of each (1 or 3x10^6 cells/ml) went into either 2 or 21 % oxygen environments. Day zero is the starting point and one data set for each seeding (1 or 3x10^6 cells/ml) is representative of all four conditions. Meaning that I can represent the starting point for 4 conditions with 2 datasets.
Condition 1 1x10^6 at 2%
Condition 2 1x10^6 at 21%
Condition 3 3x10^6 at 2%
Condition 4 3x10^6 at 21%
Both condition 1 and 2 can be described by the one DNA dataset at day zero as cells have just been put into alginate beads; they have spent no time in either oxygen condition. Day zero just serves as a starting point for both oxygen conditions at given seeding density.
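To lay the design out as a table, here is a rough Python sketch that enumerates the cells described above. The variable names are just illustrative, and `None` marks "no oxygen condition assigned yet" at day 0; it only lists which groups exist at each time point and says nothing about replicates.

```python
from itertools import product

seedings = ["1x10^6", "3x10^6"]   # cells/ml, as described above
oxygens = ["2%", "21%"]           # oxygen environments
days = [0, 14, 28]                # days in culture

cells = []
for day in days:
    if day == 0:
        # Day 0: one group per seeding density, no oxygen condition yet.
        cells += [(day, s, None) for s in seedings]
    else:
        # Later time points: every seeding x oxygen combination.
        cells += [(day, s, o) for s, o in product(seedings, oxygens)]

for day, seeding, oxygen in cells:
    print(f"day {day:>2}  seeding {seeding}  oxygen {oxygen or 'not yet assigned'}")
```

So there are 2 groups at day 0 and 4 groups at each of days 14 and 28, and the layout is not a full factorial in time, seeding and oxygen.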
I still don't know what the subjects (or experimental units) are. I still don't understand how each subject is assigned to a treatment (or what you call a condition).
You could say ... "experimental units" are "petri dishes", or cells or whatever. Simple, clear, and it answers my question clearly. From there, you could explain how each subject is assigned to its time zero treatments, and how each subject is assigned to future treatments. Maybe even provide a table... I don't really care about DNA, I care about the experimental design.
Maybe the problem is that I just don't understand you, and it's possible that this is perfectly clear to someone else. But without a clearer understanding of the experimental design, and how subjects are assigned to treatments, I have to bow out of this discussion, as I can't help.
Your design is pretty common in assays; see, e.g., the European Pharmacopoeia. You can't add the time zero units twice. Why not use time as a continuous variable and set up a regression slopes model? That way time zero serves as the intercept for both treatments. Or you can use Dunnett's test to compare each level against time zero, if that's your interest. Just some ideas. Cheers
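To make the regression slopes idea concrete, here is a minimal Python sketch for one seeding density. It fits y = a + b_low * day (low oxygen) or y = a + b_high * day (high oxygen) by ordinary least squares, so the day 0 units go in once and only inform the shared intercept a. The numbers are made up for illustration (they are not from the experiment above), the function names are my own, and it assumes the response is roughly linear in time and that every measurement comes from a separate unit.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_common_intercept(rows):
    """rows: (day, oxygen, value) tuples; oxygen is None at day 0.

    Returns [intercept, slope_low, slope_high] from least squares.
    """
    X, y = [], []
    for day, oxygen, value in rows:
        # Day 0 rows have zeros in both slope columns, so they are
        # entered once and contribute only to the intercept.
        X.append([1.0,
                  float(day) if oxygen == "low" else 0.0,
                  float(day) if oxygen == "high" else 0.0])
        y.append(value)
    p = 3
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * v for r, v in zip(X, y)) for i in range(p)]
    return solve_linear(xtx, xty)


# Hypothetical data: 4 units at day 0, then 4 per oxygen level per day.
offsets = [-0.2, -0.1, 0.1, 0.2]
rows = [(0, None, 10.0 + d) for d in offsets]
for oxygen, slope in (("low", 0.5), ("high", 1.0)):
    for day in (14, 28):
        rows += [(day, oxygen, 10.0 + slope * day + d) for d in offsets]

intercept, slope_low, slope_high = fit_common_intercept(rows)
print(f"intercept={intercept:.3f} low={slope_low:.3f} high={slope_high:.3f}")
```

The treatment comparison then becomes a comparison of the two slopes rather than of the day 0 means. In practice you would fit this in a stats package to get standard errors and a test of the slope difference.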