Multiple means comparison for non-parametric data?
Nov 17, 2009 12:19 PM
I am analyzing some data that is so strongly left-skewed that it has to be analyzed nonparametrically. Although my preferred question would be to see whether three wasp species differ in their preferences among three baits (a factorial design would work great if the data could be normalized), I think I’d settle for the Kruskal-Wallis analysis for each species separately IF I could then do the multiple pairwise comparisons, but I'm not sure I can even do that. One could do multiple Wilcoxon tests with Bonferroni corrections, for example. Is this possible with JMP? I don’t see an obvious way to set it up.
Of course, we're not actually talking about multiple MEANS comparison for a nonparametric data set, but simply multiple comparisons among treatment levels. What I've done so far is create a new table that sorts by the independent variable of interest (bait type), exclude rows with one bait type, then do a Y by X comparison (nonparametric) By species. Repeat until all three bait comparisons are made. Clunky, but it works. I'd still much prefer learning a better way, if JMP provides one!
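For readers working outside JMP, the "pairwise Wilcoxon tests with a Bonferroni correction, done separately for each species" idea can be sketched in plain Python. This is only a rough illustration, not JMP's procedure: it uses the normal approximation to the rank-sum test with a tie correction and no continuity correction, and the function names, bait labels, and counts are all made up for the example.

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist


def midranks(values):
    """1-based ranks; tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def rank_sum_p(x, y):
    """Two-sided rank-sum p-value (normal approximation, tie-corrected)."""
    pooled = list(x) + list(y)
    n1, n2, n = len(x), len(y), len(pooled)
    u1 = sum(midranks(pooled)[:n1]) - n1 * (n1 + 1) / 2
    tie_term = sum(t**3 - t for t in (pooled.count(v) for v in set(pooled)))
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:  # every observation identical: no evidence of a difference
        return 1.0
    z = (u1 - n1 * n2 / 2) / sqrt(var)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def pairwise_bonferroni(groups):
    """groups maps a bait label to its counts; returns adjusted p-values."""
    pairs = list(combinations(sorted(groups), 2))
    return {
        (a, b): min(1.0, rank_sum_p(groups[a], groups[b]) * len(pairs))
        for a, b in pairs
    }


def pairwise_by_species(rows):
    """rows are (species, bait, count) tuples, one per trap-species pair."""
    by_species = {}
    for species, bait, count in rows:
        by_species.setdefault(species, {}).setdefault(bait, []).append(count)
    return {sp: pairwise_bonferroni(g) for sp, g in by_species.items()}


# Made-up counts, purely for illustration.
example = {
    "yellowjacket": {
        "bait_A": [0, 0, 1, 2, 9],
        "bait_B": [3, 5, 5, 8, 12],
        "bait_C": [0, 1, 1, 4, 6],
    },
}
rows = [
    (sp, bait, c)
    for sp, baits in example.items()
    for bait, counts in baits.items()
    for c in counts
]
for species, result in pairwise_by_species(rows).items():
    for pair, p_adj in result.items():
        print(species, pair, round(p_adj, 4))
```

With three baits there are three pairs per species, so each raw p-value is multiplied by 3 and capped at 1. With small samples an exact test would be preferable to the normal approximation used here.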
I don't think JMP has a built-in two-way non-parametric test.
However, a quicker alternative to your approach (repeated Kruskal-Wallis tests on subsets of data) would be to add bait type to the "By" field in the Fit Y by X dialog.
One rather robust alternative is to perform an ordinary two-way ANOVA (Fit Model platform) on the global ranks. Ranks can be obtained by sorting your data and using the fill command in a new column (type 1 and 2 in the first two rows, select them, and right-click). There may also be a direct method to save ranks to a table that I am not aware of.
Thanks for the tips. I don't think I can include bait in the By field since it's already the independent variable (I tried anyway, and was told a variable can't be both).
I'll have to try the global ranks approach next. My instincts are these may still not be normalizable data, but there's only one way to find out! I also would need the direct ranking method, since I have beaucoups of ties (nearly 1100 values that range from 0 to 31!).
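For what it's worth, here is one way the "direct ranking" step could look outside JMP when there are many ties. It is a minimal sketch with a made-up function name and made-up counts: every tied value receives the average of the ranks it would occupy, which a plain sort-and-fill cannot do.

```python
from collections import Counter


def global_ranks(values):
    """Rank all values together (1 = smallest); ties get the average rank."""
    counts = Counter(values)
    rank_of = {}
    below = 0  # number of observations strictly smaller than the current value
    for v in sorted(counts):
        t = counts[v]
        rank_of[v] = below + (t + 1) / 2  # mean of ranks below+1 .. below+t
        below += t
    return [rank_of[v] for v in values]


wasp_counts = [0, 0, 0, 2, 2, 31]  # made-up counts with heavy ties
print(global_ranks(wasp_counts))  # [2.0, 2.0, 2.0, 4.5, 4.5, 6.0]
```

The resulting column of ranks is what would go into the two-way ANOVA as the response. Because it works from a tally of distinct values, this stays cheap even with about 1100 observations spread over only 32 possible counts.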
OK, I misunderstood how you had organized your data. And a lot of ties may be a problem for the ANOVA approach.
How are the wasp bait preferences represented? As proportions or frequencies? Or is each individual assigned just one dominant bait type? Although I still don't know exactly what you are trying to do, I was thinking that you could use the logistic platform or a contingency table to look for differences instead of Kruskal-Wallis.
I'm not entirely comfortable with how I've organized the data. I usually try to make each row represent a single experimental subject, but in this case each row represents a single trap-species combination. That is, if trap #123 has 9 yellowjackets and 5 hornets, then it takes up two rows, one for the 'jackets and another for the hornets. Thus, number of wasps is one variable, and species identity is another. To me it makes more sense conceptually to have one row per trap, with each species' count in its own column, but I don't think you can then compare the different species to each other (e.g., regarding bait types).
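The two layouts described here hold the same information, and converting between them is mechanical. Below is a small sketch of going from one row per trap to one row per trap-species combination. The trap number and counts come from the example above, but the bait label, column names, and function name are made up for illustration.

```python
def wide_to_long(trap_rows, species_columns):
    """Turn one-row-per-trap records into one-row-per-trap-species records."""
    long_rows = []
    for trap in trap_rows:
        for species in species_columns:
            long_rows.append({
                "trap": trap["trap"],
                "bait": trap["bait"],
                "species": species,
                "count": trap[species],
            })
    return long_rows


# Trap 123 with 9 yellowjackets and 5 hornets; the bait label is hypothetical.
wide = [{"trap": 123, "bait": "bait_A", "yellowjacket": 9, "hornet": 5}]
for row in wide_to_long(wide, ["yellowjacket", "hornet"]):
    print(row)
```

The long layout is the one that lets species act as a factor alongside bait, which is what a species-by-bait comparison needs. Within JMP, the Tables > Stack command should do the same kind of reshaping.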