
How to transform data to ranks?

Hello,
I want to compare the degree of defoliation at different stations, so I sampled three trees per station, and for each tree I sampled at the bottom, the middle and the top of the crown. I wanted to use a two-way ANOVA to see whether there is a difference between stations and between levels in the tree, but my problem is that the variances are highly unequal.
So I was wondering how I could do a rank transformation on my data using JMP? Then I would do a regular two-way ANOVA.

Thanks in advance

Adrien

Accepted Solutions
sseligman
Staff

Re: How to transform data to ranks?

A quick way to transform your data to ranks is to use the Distribution platform. First, go to Analyze > Distribution and place your variable of interest in the Y role. If you have a grouping variable, place it in the By role. Click OK. Next, click on the red triangle drop down menu next to your response (if you have a By variable press and hold the Ctrl key beforehand to broadcast the action across group levels) and choose Save > Ranks or Ranks averaged. This method will save the ranks to your original data table.
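For anyone who wants to see what the saved column contains, here is a rough sketch in Python (not JSL, and not JMP's own code). It assumes that "Ranks averaged" gives tied values the mean of the ranks they span, and that a By variable ranks each group separately. The function names and the defoliation values are made up for illustration.

```python
from collections import defaultdict

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with order[i].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def average_ranks_by_group(values, groups):
    """Rank within each group separately (assumed analogue of a By variable)."""
    index_by_group = defaultdict(list)
    for i, g in enumerate(groups):
        index_by_group[g].append(i)
    ranks = [0.0] * len(values)
    for indices in index_by_group.values():
        group_ranks = average_ranks([values[i] for i in indices])
        for i, r in zip(indices, group_ranks):
            ranks[i] = r
    return ranks

# Hypothetical defoliation percentages, for illustration only.
defoliation = [12.0, 30.0, 12.0, 55.0, 30.0, 8.0]
station = ["A", "A", "A", "B", "B", "B"]

overall = average_ranks(defoliation)                    # ranked across all rows
within = average_ranks_by_group(defoliation, station)   # ranked inside each station
print(overall)
print(within)
```

Note that ranking across all rows and ranking within each group give different columns, so which one you save depends on the comparison you plan to run on the ranks.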


6 REPLIES 6
mfisher
Level III

Re: rank transformation

Adrien,

I'd sort your data by defoliation, then create a new column named "rank" and apply a formula to this column. The row function "sequence" will number the rows in this column from 1 to whatever you specify, and because the rows are sorted by defoliation, that numbering gives you the ranks.

This won't serve you if you have lots of tied observations, however.
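To show the tie problem concretely, here is a small Python sketch of the sort-then-number idea (an illustration only, not JMP's sequence function). The function name and the values are hypothetical.

```python
def sequence_ranks(values):
    """Sort the rows by value, then number them 1..n in sorted order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, i in enumerate(order, start=1):
        ranks[i] = position
    return ranks

# Hypothetical defoliation percentages; the two 12.0 values are tied.
defoliation = [12.0, 30.0, 12.0, 55.0]
ranks = sequence_ranks(defoliation)
print(ranks)  # the tied 12.0 values receive ranks 1 and 2, not a shared rank
```

The two equal observations end up with different ranks purely because of their row order, which is why this approach breaks down when ties are common.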

Re: rank transformation

My first thought would be that if you could find a transformation of your data that would normalise it, you'd make life a lot easier for yourself, not least because you'd still be able to answer questions about the data (like how much more defoliated the trees are at station A than at station B) that you'll have difficulty answering if you start taking ranks. As a separate point though, it seems to me that if you feed the ranks into a regular two-way ANOVA, you're still not really satisfying the assumption of independent, Normally distributed residuals with constant variance: you're just making it harder to demonstrate that they're not (and actually you know they're not if you're analysing ranks).

One transformation you might consider for your data is an arcsine square root transformation, which is the one usually recommended when your data is a proportion (see for example http://udel.edu/~mcdonald/stattransform.html ). This one goes some way towards normalising variances towards the extreme ends of the scale (i.e. 0% and 100%), though even there there's not much you can do to normalise the very extreme ends.
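As a minimal sketch of that transformation in Python (assuming the defoliation is recorded as a percentage between 0 and 100, which the original post does not state), the transformed value is the arcsine of the square root of the proportion. The function name and sample values are hypothetical.

```python
import math

def arcsine_sqrt(proportion):
    """Return asin(sqrt(p)) in radians for a proportion p in [0, 1]."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return math.asin(math.sqrt(proportion))

# Hypothetical defoliation percentages, converted to proportions first.
percent = [0.0, 25.0, 50.0, 100.0]
transformed = [arcsine_sqrt(p / 100.0) for p in percent]
print(transformed)  # values run from 0 up to pi/2
```

The result is in radians and ranges from 0 to pi/2, so the transformed column is what would go into the ANOVA in place of the raw percentages.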

If your data includes large numbers of instances of almost complete defoliation or almost no defoliation at all, you might consider analysing a binary variable (i.e. "defoliated" / "not defoliated") or an ordinal variable (1="not defoliated", 2="partially defoliated", 3="wholly defoliated") using a logistic regression (see for example http://udel.edu/~mcdonald/statlogistic.html ), which would still enable you to fit explanatory variables to your data like station, type of tree, age of tree, height of tree etc. Logistic regression is provided within the "Fit Model" platform: just change the personality to whatever type of logistic regression you intend to perform, and make sure that the modelling type of your response variable is set to "Nominal" or "Ordinal" (otherwise the two logistic regression options will be greyed out).

Re: rank transformation

Hey,
thanks for your help. I tried the arcsine square root transformation but the variances are still unequal.
And if I used a logistic regression I wouldn't be able to compare the degree of defoliation between the different levels of the tree, right?

thanks again
statman
Super User

Re: rank transformation

First question...is the measurement system adequate? How do you measure defoliation? Is the data categorized?

Second question (set). What do you want to see? Do you want to find out if the variation is different within tree or tree-to-tree or between station? Use control charts and variability charts to see the differences.
"All models are wrong, some are useful" G.E.P. Box

Re: rank transformation

The defoliation was measured by sampling leaves and estimating the defoliation per leaf, then I calculated the average defoliation per leaf.
I want to see where the defoliation is greatest (top, middle or bottom of the crown), and I also want to see which stations are significantly more defoliated than the others.