New in JMP 11: Make Binning Formula

JMP 11 provides a convenient way to convert a continuous numeric column into a new column that represents a sequence of ranges. The command, Make Binning Formula, is in the Columns menu. It opens a dialog that lets you interactively choose the binning parameters (offset and width) and the format of the new column.

Here is the heart of the dialog being used to bin the Birth Years column from the Consumer Preferences sample data file.

I've set the parameters to create 10-year bins corresponding to decades. The drop-down menu in the upper right offers a variety of output formats, and I've chosen “Low – High-1”, which works well to clarify the endpoints for integer data. In mathematical terms, each range is closed at the low value and open at the high value. The result is a new column that contains the following formula:

The formula is straightforward since it doesn't create the actual text to describe the range. That part is done by a Value Label column property, which is added automatically. Here are a few rows of the new column, which I renamed "Birth Decade," next to the original column.
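For readers who want to see the idea outside JMP, here is a rough Python sketch of offset-and-width binning with a separate labeling step. It is not the formula JMP generates; the function names `bin_index` and `bin_label`, the default offset of 0, and the sample years are all assumptions made for illustration. The point it mirrors is the split described above: the computed value is just a bin number, and the readable "Low - High-1" text is attached separately.

```python
import math

def bin_index(value, offset=0, width=10):
    """Return the integer bin number for value.

    Each bin covers [low, high): closed at the low end, open at the high end.
    """
    return math.floor((value - offset) / width)

def bin_label(index, offset=0, width=10):
    """Build a 'Low - High-1' style label for a bin number (integer data)."""
    low = offset + index * width
    high = low + width
    return f"{low} - {high - 1}"

# Hypothetical birth years, chosen to show what happens at bin boundaries.
years = [1959, 1960, 1969, 1970, 1985]
for year in years:
    index = bin_index(year)
    print(year, index, bin_label(index))
```

With a width of 10 and an offset of 0, 1969 lands in the bin labeled "1960 - 1969" and 1970 starts the next bin, which is the closed-low, open-high behavior.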

Now, why are we doing all this? In general, it can be useful to create bins to look for coarser effects in a data set. In my example, I want to create a histogram-like bar graph so I can compare the counts of males and females in a single graph. Here’s the regular histogram view of Birth Year for both genders.

After making the binned column, I can use Graph Builder with a bar element to interleave the bars and focus on the differences in the two groups by decade.
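The counts behind such an interleaved bar graph can be sketched in a few lines of Python. This is only an illustration of the tallying, not what Graph Builder does internally; the rows below are invented and are not taken from the Consumer Preferences file, and `decade` and `grouped_counts` are hypothetical helper names.

```python
from collections import Counter

# Hypothetical (gender, birth year) rows, invented for illustration only.
rows = [("F", 1962), ("M", 1968), ("F", 1971),
        ("M", 1975), ("M", 1979), ("F", 1983)]

def decade(year, width=10):
    """Return the low end of the bin that contains year."""
    return (year // width) * width

# Tally rows by (decade, gender).
counts = Counter((decade(year), gender) for gender, year in rows)

def grouped_counts(counts):
    """List (decade, gender, count) with genders interleaved within each decade."""
    decades = sorted({d for d, _ in counts})
    genders = sorted({g for _, g in counts})
    return [(d, g, counts.get((d, g), 0)) for d in decades for g in genders]

for d, g, n in grouped_counts(counts):
    print(d, g, n)
```

Listing every decade and gender combination, including those with a count of zero, keeps the bars aligned so the two groups can be compared decade by decade.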

Visitor

Mike Clayton wrote:

Thanks, that is useful.

What is the real data stratification in those bins, which is hidden in the histogram unless you spread out THOSE bins... which is what I normally do first... and then highlight the categorical M F. There should be data for every year in most real data examples, so many more bins are possible in the histogram view.

That is uglier but also useful to think about M F by year.

But of course if you only have a few years (as your example shows) then there is no gain from higher resolution. But decade has no meaning to me in this context unless some other factor relates to decades. I will have to look at the raw data example as it relates to my normal studies of categorical variable relationships. But it is a nice way to make a good-looking graph, assuming it does not mislead the analysts into assuming the data sampling was richer than it was.

Visitor

Kevin wrote:

How do I set bins in JMP 9.0.0? Analyze - Distribution is making one bin for each unique value. Since I have ~1000 unique values, the histogram runs off the screen. I haven't been able to see one histogram yet.

Staff

Xan Gregg wrote:

Kevin, the JMP Discussion Forum is the best place for such questions, but a quick guess is that your data is marked as Nominal instead of Continuous.