Hi all - was wondering if someone could give me a few pointers on a problem I'm struggling with. I would like to create a dummy variable using a range of values based on a continuous variable.
I would like to take a set of values and place them in nominal "buckets" if you will.
For example the data is
I want take those values and id when range they fall into. So 1- 12, 13-20, etc. Any ideas??
select the column you are interested in, and then go to
and create a new column that has the values binned into new groups
done. But what about large datasets? The one I was working on was only 450 rows. If I have a dataset with 29K rows, can I use a formula to go through and recode everything?
1. Yes you can use a formula. Here is one version that you could use
If( Row() == 1, min = Col Min( :height ); incr = (Col Max( :height ) - Col Min( :height )) / 9; ); Floor( (:height - min) / incr ) + 1;
It uses the Big Class sample data table. Just change the variable :Height out for the column you are calculating against.
2. But the number of rows does not affect the Recode. Only the number of unique values within those rows.
I think that Jim already answered your question, but I wanted to expand a bit on the answer. The If() function in a formula or script is very versatile. Here is an example of what you want to do (bin data) from our JMP Software: Introduction to the JMP Scripting Language:
It compares the numeric value in Age with the cutoffs and returns a character string in the new column, Stage.
I tried to follow this example using a column formula. I get no error but the column is blank.
if "Pilot A", etc, is a string, it needs quotation marks. There might be a message in the log about an undefined variable.
I forgot to mention that if you are not otherwise scripting, you can interactively create a new data column with the bins: select the data column with the continuous values and then select Cols > Utilities > Make Binning Formula.
There are no labels assigned to this post.