Discussions

BHarris · Apr 22, 2025 01:14 PM

Suppose I have this table:

Category	Item	Level
A	1	delta
A	2	foxtrot
A	3	hotel
B	1	juliet
B	3	lima

Note that Category=B, Item=2 is missing.

If I try to "split" this table with "Split By" = Category and "Split Columns" = Level, and "Keep All", it returns this:

Item	A	B
1	delta	juliet
2	foxtrot	lima
3	hotel

Note that on the newly Split table, it shows that Item=2,Category=B is "lima", but that's incorrect -- that cell should be blank, and "lima" should be under Item=3,Category=B.

Is this user error, or is this a bug in JMP?
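The output above can be reproduced with a small Python model. This is only a guess at the mechanism, not JMP's actual implementation. It assumes that, with no grouping column, the Level values of each Category are packed top-down into that category's new column, and that the kept Item column takes the first value that lands in each output row. The function name `split_without_group` is made up for this sketch.

```python
# Hypothetical model of Split with "Keep All" and no grouping column.
# Assumption: values are packed per category in row order; this is not JMP code.
rows = [
    ("A", 1, "delta"),
    ("A", 2, "foxtrot"),
    ("A", 3, "hotel"),
    ("B", 1, "juliet"),
    ("B", 3, "lima"),
]

def split_without_group(rows):
    categories = []   # split-by values in order of first appearance
    packed = {}       # category -> list of (item, level) in row order
    for category, item, level in rows:
        if category not in packed:
            categories.append(category)
            packed[category] = []
        packed[category].append((item, level))

    height = max(len(values) for values in packed.values())
    table = []
    for i in range(height):
        out = {"Item": None}
        for category in categories:
            if i < len(packed[category]):
                item, level = packed[category][i]
                if out["Item"] is None:   # kept column: first value wins
                    out["Item"] = item
                out[category] = level
            else:
                out[category] = ""        # this category has no value left
        table.append(out)
    return table

result = split_without_group(rows)
for row in result:
    print(row)
```

Under these assumptions "lima" ends up in the second output row next to Item=2, exactly as in the table above, because nothing ties the B values to their Item.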

hogi · May 10, 2025 02:16 AM

You mentioned it, but maybe it helps to write it explicitly:

JMP applies the rule:

- Group: Keep and use for grouping

- Keep: keep (~ don't drop)

@BHarris wrote:

Now I'm left trying to figure out if all of the other columns should be grouping columns if I'm doing "Keep all"...

I guess: yes.
You could try both ways, check the differences - and then decide.
Group helps you to "structure" the data. If the data is already structured, it doesn't hurt. If it is not structured, it will guarantee a meaningful output.
With Keep you have to know very precisely what you are doing. Below are some examples.
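To make the difference concrete, here is a minimal Python sketch of what Group does to the table from the first post, assuming Item is used as the grouping column. It illustrates the rule above and is not JMP's implementation; `split_with_group` is a hypothetical name.

```python
# Hypothetical model of Split with Item as a grouping column (not JMP code).
rows = [
    ("A", 1, "delta"),
    ("A", 2, "foxtrot"),
    ("A", 3, "hotel"),
    ("B", 1, "juliet"),
    ("B", 3, "lima"),
]

def split_with_group(rows):
    categories = []   # split-by values in order of first appearance
    table = {}        # item -> {category: level}; dicts keep insertion order
    for category, item, level in rows:
        if category not in categories:
            categories.append(category)
        # Each value goes to the row of its own Item.
        # A later duplicate of the same (item, category) would overwrite this cell.
        table.setdefault(item, {})[category] = level
    # Cells that never received a value are left blank.
    return [
        {"Item": item, **{c: cells.get(c, "") for c in categories}}
        for item, cells in table.items()
    ]

grouped = split_with_group(rows)
for row in grouped:
    print(row)
```

Because every value is addressed by its Item, the missing (Item=2, B) cell stays blank and "lima" lands in the Item=3 row.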

These little discussions in the community help a lot to understand - to challenge and polish the user's knowledge.
Here is a Collection of "funny" Jmp newbie questions that came up in our user coaching sessions. I've collected them to help other users understand JMP. Many of the questions are just rhetorical ones - they illustrate that sometimes it's just the "JMP slang" that makes understanding JMP logic non-trivial.

@BHarris wrote:

Until then, it will likely continue to make me nervous, as its behavior for me in these conditions isn't obvious, results in faulty output data under conditions that may be hard to identify, and is not well-enough documented for me to understand. At least that nervousness will keep me more alert in the future when using it.

I agree. It's very important to "know what JMP is doing" - and to know it in enough detail that the user feels capable of doing the right thing. Not nervous, not surprised. If something unexpected happens, a user starts to feel anxious - where are the other Places where Jmp does something unexpected ?!?!

Maybe there is a great feature in JMP, but it annoys or frightens users - at least those who don't expect what JMP does ...
Having a map of these trap doors can help users escape them. Every JMP user should have such a map to "feel safe" working with JMP.

Even if the details are well documented - JMP is soooo powerful, at the same time there is too little and too much documentation!
This makes it difficult to find the right information. The LearnBot makes it much easier now, but it still does not know all the details.

hogi · May 10, 2025 1:15 AM

An extreme case for Split:

Keep: name

no grouping column

With sex = M, F, 2 rows are merged into one row and many "names" are lost:

The first *) item wins, so we start with female names, the corresponding male names are dropped.
All male names? no -- at the end, there are some male students left.

*) There is NO universal rule that holds for all Tables platforms about which value will be kept and which value will be lost.

Join also keeps the first value (and doesn't care about subsequent values).
Update "keeps" every single value - and overwrites the previous entry - the last one "sticks".
This is why Tables/Update is so slow (it can take hours to execute the command): Speed up Tables/Update

hogi · May 11, 2025 2:49 AM

@hogi wrote:

The first *) item wins

oops ... this is just true for columns added via Keep.

For the Split Columns, Split behaves like Update: the last one "sticks".

I tried to illustrate it with this example where I marked the "name" entries and the M/F split entries which end up in the final data set. One can see:
- columns added via Keep (like name) : Rank=1 (take first one, skip subsequent ones)

- columns added via Split Columns (height --> M/F): RankReverse=1 (last one "sticks")
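The two policies in the list can be written out as a short Python sketch. The data is made up, and the helper names `keep_first` and `keep_last` are hypothetical. The sketch only shows the difference between "first one wins" and "last one sticks" when several values map to the same cell; it does not claim to be how JMP implements either.

```python
# Made-up values that all compete for the same output cell, in row order.
candidates = ["first", "second", "third"]

def keep_first(values):
    """Rank=1: take the first value, skip subsequent ones."""
    kept = None
    for value in values:
        if kept is None:
            kept = value
    return kept

def keep_last(values):
    """RankReverse=1: every value overwrites the previous one."""
    kept = None
    for value in values:
        kept = value
    return kept

first_result = keep_first(candidates)
last_result = keep_last(candidates)
print(first_result)  # first
print(last_result)   # third
```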

hogi · May 11, 2025 2:50 AM

The behavior follows a logic, but one can argue whether the logic is straightforward / expected by every user : )

~~I wonder if Keep All is the right choice as default? If JMP starts with~~

~~... then a user who activates the option Keep All can consult the documentation to be sure that it does what he wants.~~
JMP could even show a warning: "Are you sure that you want to use Keep? better use Group?"

edit: sorry, I proposed what is already there:
This Platform already starts in "safe mode", with the "Drop all" setting.
And once the user activates "keep all", JMP SHOWS a warning [the one in post N-2].

________________________________________________

Bottom line: no surprises or anxiety for users who don't use the option.

Much more dangerous than Split: Tables/Update
The default interface also starts with "All"
a) for Add Columns
b) for Replace columns (!)

Some years ago, I considered it dangerous enough to write an AddIn which replaces the default settings with:

The additional benefit: with this change, the platform doesn't start with a setting that tries to Update thousands of columns.
This makes it orders of magnitude faster than the original one : )
https://community.jmp.com/t5/JMP-Wish-List/Speed-up-Tables-Update/idc-p/653334/highlight/true#M4531

hogi · May 10, 2025 05:14 AM

I added the topic to Caution: Places where Jmp does something unexpected

BHarris · May 10, 2025 07:20 PM

I'd love to see an animation someday that shows what split (and stack) are doing. I imagine the split-by column values being pushed up into the column headers, and the split-columns getting pushed into the cells of those new columns. (I was surprised just now to learn that if there are multiple split-columns for a given split-by column, then new numbered columns are created.)

But the real interesting part comes when there are conflicts, e.g. multiple values trying to end up in the same cell. JMP does give a warning that this is happening (which is great!), but this whole post came about from a dataset that looked like it was being split correctly, but after I dug into it I realized it wasn't, and I had ignored that warning because I didn't understand it.

My 2 cents, the warning should be changed from: "Multiple rows, possibly with different values, of the columns to be kept are mapped to the same row in the split table. Only one value of each column is retained." ---> to: "Multiple values being mapped to the same cell; only one is kept."
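One way to see whether a table would run into this kind of conflict is to count, before splitting, how many source rows map to each output cell. The Python sketch below does that for a made-up table. The function name `find_collisions` and the choice of key (grouping values plus split-by value) are assumptions for illustration, not something JMP exposes.

```python
from collections import Counter

def find_collisions(rows, group_cols, split_by):
    """Return the (group values, split-by value) cells hit by more than one row."""
    counts = Counter(
        (tuple(row[c] for c in group_cols), row[split_by]) for row in rows
    )
    return {cell: n for cell, n in counts.items() if n > 1}

# Made-up data: two rows compete for the cell (Item=1, Category=A).
rows = [
    {"Category": "A", "Item": 1, "Level": "delta"},
    {"Category": "A", "Item": 1, "Level": "echo"},
    {"Category": "B", "Item": 1, "Level": "juliet"},
]

collisions = find_collisions(rows, group_cols=["Item"], split_by="Category")
print(collisions)
```

An empty result means every cell receives at most one value for the chosen grouping columns; any entry in the result names a cell where only one of several values could be kept.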

hogi · May 11, 2025 8:49 AM

@BHarris wrote:

I was surprised just now to learn that if there are multiple split-columns for a given split-by column, then new numbered columns are created.

If multiple split-columns are used, JMP creates the column names as a combination of the name of the split-column and the entry of the split-by column:

If only one column is selected as split-column, the name of the split-column is not included in the column name:
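The naming rule can be sketched in Python. The separator between the two parts (a single space here), the column order, and the function name `split_column_names` are assumptions for illustration; the exact format JMP produces may differ.

```python
def split_column_names(split_cols, split_by_values, sep=" "):
    """Build output column names from split-columns and split-by entries."""
    if len(split_cols) == 1:
        # Single split-column: only the split-by entry is used as the name.
        return list(split_by_values)
    # Several split-columns: combine split-column name and split-by entry.
    return [f"{col}{sep}{value}" for col in split_cols for value in split_by_values]

single = split_column_names(["height"], ["F", "M"])
multiple = split_column_names(["height", "weight"], ["F", "M"])
print(single)
print(multiple)
```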

My colleagues roll their eyes when I start using Big Class to explain a functionality of JMP.
But it's a wonderful trick to use an alternative data set to play with the settings:
with "M" and "F" it's obvious what JMP is doing, no need to wonder about the "numbers" that appeared with the original data set.

Big Class looks very simple, but it's complicated enough to explain many of the details.
Only collinearities are missing ...
