BHarris
Level VI

Bad Splitting?

Suppose I have this table:

 

Category  Item  Level
A         1     delta
A         2     foxtrot
A         3     hotel
B         1     juliet
B         3     lima

 

Note that Category=B, Item=2 is missing.

 

If I try to "split" this table with "Split By" = Category and "Split Columns" = Level, and "Keep All", it returns this:

 

Item  A        B
1     delta    juliet
2     foxtrot  lima
3     hotel

 

Note that on the newly Split table, it shows that Item=2,Category=B is "lima", but that's incorrect -- that cell should be blank, and "lima" should be under Item=3,Category=B.

 

Is this user error, or is this a bug in JMP?

 

16 REPLIES 16
hogi
Level XII

Re: Bad Stacking?

You mentioned it, but maybe it helps to write it explicitly:

JMP applies the rule:

- Group: Keep and use for grouping

- Keep: keep (~ don't drop)
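To make the Group/Keep distinction concrete, here is a minimal Python sketch (not JSL, and not JMP's actual implementation) that models both modes on the table from the original question. The function names `split_keep_only` and `split_grouped` are made up for illustration, and the "pack each category's values top-down" model is an assumption inferred from the output shown in the question.

```python
from collections import defaultdict

# The table from the question: (Category, Item, Level)
rows = [
    ("A", 1, "delta"),
    ("A", 2, "foxtrot"),
    ("A", 3, "hotel"),
    ("B", 1, "juliet"),
    ("B", 3, "lima"),
]


def split_keep_only(rows):
    """Assumed model of Keep: Item is carried along but not used for alignment."""
    packed = defaultdict(list)  # category -> its values in source order
    items = []                  # kept Item value per output row (first one wins)
    for category, item, level in rows:
        position = len(packed[category])  # next free output row for this category
        packed[category].append(level)
        if position == len(items):        # first source row to reach this output row
            items.append(item)
    categories = sorted(packed)
    return [
        (items[i],)
        + tuple(packed[c][i] if i < len(packed[c]) else "" for c in categories)
        for i in range(len(items))
    ]


def split_grouped(rows):
    """Assumed model of Group: every distinct Item value gets its own output row."""
    table = defaultdict(dict)  # item -> {category: level}
    for category, item, level in rows:
        table[item][category] = level
    categories = sorted({category for category, _, _ in rows})
    return [
        (item,) + tuple(table[item].get(c, "") for c in categories)
        for item in sorted(table)
    ]


print(split_keep_only(rows))  # "lima" lands next to Item 2, as in the question
print(split_grouped(rows))    # "lima" stays with Item 3, Item 2 / B is blank
```

In this model, Keep only copies Item into the result, while Group is what ties each value to its Item.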

 


@BHarris wrote:

Now I'm left trying to figure out if all of the other columns should be grouping columns if I'm doing "Keep all"...


I guess: yes.
You could try both ways, check the differences - and then decide.
Group helps you to "structure" the data. If the data is already structured, it doesn't hurt. If it is not structured, it will guarantee a meaningful output.
With Keep you have to know very precisely what you are doing. Below are some examples.

 

These little discussions in the community help a lot to understand - to challenge and polish the user's knowledge.
Here is a Collection of "funny" Jmp newbie questions that came up in our user coaching sessions. I've collected them to help other users understand JMP. Many of the questions are purely rhetorical - they illustrate that sometimes it's just the "JMP slang" that makes understanding JMP logic non-trivial.

 


@BHarris wrote:

Until then, it will likely continue to make me nervous, as its behavior for me in these conditions isn't obvious, results in faulty output data under conditions that may be hard to identify, and is not well-enough documented for me to understand.  At least that nervousness will keep me more alert in the future when using it.


 

I agree. It's very important to "know what JMP is doing" - and to know it in enough detail that the user feels capable of doing the right thing. Not nervous, not surprised. If something unexpected happens, a user starts to feel anxious - where are the other Places where Jmp does something unexpected ?!?!

Maybe there is a great feature in JMP, but it annoys or frightens users - at least those who don't expect what JMP does ...
Having a map of these trap doors can help users escape them. Every JMP user should have such a map to "feel safe" working with JMP.


Even if the details are well documented - JMP is soooo powerful that there is, at the same time, too little and too much documentation!
This makes it difficult to find the right information. The LearnBot makes it much easier now, but it still does not know all the details.

hogi
Level XII

Re: Bad Stacking?

An extreme case for Split:

Keep: name

no grouping column

hogi_1-1746858820638.png

With sex = M, F, 2 rows are merged into one row and many "names" are lost:

hogi_0-1746864850514.png

The first *) item wins, so we start with female names, the corresponding male names are dropped.
All male names? no -- at the end, there are some male students left.

 

 

*) There is NO universal rule, valid for all Tables platforms, that tells which value will be kept and which value will be lost.

Join also keeps the first value (and doesn't care about subsequent values).
Update "keeps" every single value - and overwrites the previous entry - the last one "sticks".
This is why Tables/Update is so slow (it can take hours to execute the command): Speed up Tables/Update 

hogi
Level XII

Re: Bad Stacking?


@hogi wrote:

The first *) item wins


Oops ... this is only true for columns added via Keep.

For the Split Columns, Split behaves like Update: the last one "sticks".

 

I tried to illustrate it with this example where I marked the "name" entries and the M/F split entries which end up in the final data set. One can see:
- columns added via Keep (like name): Rank=1 (take the first one, skip subsequent ones)

- columns created via Split Columns (height --> M/F): RankReverse=1 (the last one "sticks")
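The two rules can be sketched in a few lines of Python. This only models the behavior described above (first value wins for Keep columns, last value sticks for Split Columns); it is not how JMP implements Split. The rows are made-up values in the style of Big Class, and `split_with_collisions` is a hypothetical helper that assumes age as the grouping column.

```python
# (age, sex, name, height) -- made-up values for illustration, not Big Class data
rows = [
    (12, "F", "Ann", 59),
    (12, "F", "Bea", 61),
    (12, "M", "Cal", 57),
    (12, "M", "Dan", 64),
]


def split_with_collisions(rows):
    """Group by age, split height by sex, keep name."""
    out = {}
    for age, sex, name, height in rows:
        row = out.setdefault(age, {"age": age})
        row.setdefault("name", name)  # Keep column: first value wins (Rank=1)
        row[sex] = height             # Split column: last value sticks (RankReverse=1)
    return list(out.values())


print(split_with_collisions(rows))
```

All four source rows collapse into one output row: the kept name comes from the first row, while each height comes from the last row of its sex.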

hogi_1-1746956974003.png

 

hogi
Level XII

Re: Bad Stacking?

The behavior follows a logic, but one can argue whether the logic is straightforward / expected by every user : )


I wonder if Keep All is the right choice as default? If JMP starts with 

hogi_0-1746867028288.png

... then a user who activates the option Keep All can consult the documentation to be sure that it does what he wants.
JMP could even show a warning: "Are you sure that you want to use Keep? Better use Group?"


edit: sorry, I proposed what is already there:
This Platform already starts in "safe mode", with the "Drop all" setting.
And once the user activates "keep all", JMP SHOWS a warning [the one in post N-2].

 

________________________________________________

Bottom line: no surprises and no anxiety for users who don't use the option.

 

 

Much more dangerous than Split: Tables/Update
The default interface also starts with "All"
a) for Add Columns
b) for Replace columns (!)

Some years ago, I considered it dangerous enough to write an AddIn which replaces the default settings with:

hogi_1-1746867175838.png

The additional benefit: with this change the platform doesn't start with a setting that tries to Update thousands of columns.
This makes it orders of magnitude faster than the original one : )
https://community.jmp.com/t5/JMP-Wish-List/Speed-up-Tables-Update/idc-p/653334/highlight/true#M4531  

BHarris
Level VI

Re: Bad Stacking?

I'd love to see an animation someday that shows what split (and stack) are doing.  I imagine the split-by column values being pushed up into the column headers, and the split-columns getting pushed into the cells of those new columns.  (I was surprised just now to learn that if there are multiple split-columns for a given split-by column, then new numbered columns are created.)

 

But the real interesting part comes when there are conflicts, e.g. multiple values trying to end up in the same cell.  JMP does give a warning that this is happening (which is great!), but this whole post came about from a dataset that looked like it was being split correctly, but after I dug into it I realized it wasn't, and I had ignored that warning because I didn't understand it. 

 

My 2 cents, the warning should be changed from: "Multiple rows, possibly with different values, of the columns to be kept are mapped to the same row in the split table. Only one value of each column is retained."  ---> to:  "Multiple values being mapped to the same cell; only one is kept."
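One way to catch this situation before splitting is to count how many source rows share the same combination of grouping values and split-by value. The sketch below is a hypothetical pre-check in Python, not the condition JMP itself tests before showing its warning; `find_collisions` and its parameters are names invented for this example.

```python
from collections import Counter

rows = [
    {"Category": "A", "Item": 1, "Level": "delta"},
    {"Category": "A", "Item": 2, "Level": "foxtrot"},
    {"Category": "A", "Item": 3, "Level": "hotel"},
    {"Category": "B", "Item": 1, "Level": "juliet"},
    {"Category": "B", "Item": 3, "Level": "lima"},
]


def find_collisions(rows, group_cols, split_by):
    """Return every (group values..., split-by value) key that occurs more than once."""
    counts = Counter(
        tuple(row[c] for c in group_cols) + (row[split_by],) for row in rows
    )
    return {key: n for key, n in counts.items() if n > 1}


# Grouping by Item: every (Item, Category) pair is unique, nothing to worry about.
print(find_collisions(rows, ["Item"], "Category"))
# No grouping column: each Category occurs several times, so alignment is ambiguous.
print(find_collisions(rows, [], "Category"))
```

An empty result means every value has exactly one cell to go to. Anything else is a hint to add grouping columns before trusting the split table.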

hogi
Level XII

Re: Bad Stacking?


@BHarris wrote:

I was surprised just now to learn that if there are multiple split-columns for a given split-by column, then new numbered columns are created.


If multiple split-columns are used, JMP creates the column names as a combination of the name of the split-column and the entry of the split-by column:

hogi_0-1746958543116.png

If only one column is selected as split-column, the name of the split-column is not included in the column name:

hogi_1-1746958614612.png
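The naming rule can be written down as a tiny Python sketch. The helper `split_column_names` is hypothetical, and the single space used as separator is an assumption; check the actual column names in your own split table.

```python
def split_column_names(split_columns, levels, sep=" "):
    """Model of the naming rule: prefix the split-column name only when needed."""
    if len(split_columns) == 1:
        # One split-column: the split-by entries alone become the column names.
        return list(levels)
    # Several split-columns: combine split-column name and split-by entry.
    return [f"{col}{sep}{level}" for col in split_columns for level in levels]


print(split_column_names(["height"], ["F", "M"]))
print(split_column_names(["height", "weight"], ["F", "M"]))
```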


My colleagues roll their eyes when I start using Big Class to explain a functionality of JMP.
But it's a wonderful trick to use an alternative data set to play with the settings:
with "M" and "F" it's obvious what JMP is doing, no need to wonder about the "numbers" that appeared with the original data set.

Big Class looks very simple, but it's complicated enough to explain  many of the details.
Only collinearities are missing ...


