cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
jthi
Super User

Bug when using combination of set each value, tables with same column name(s) and col statistical functions within a loop

This has already been reported to JMP support and they have submitted it as a bug to development (00049239). Seems to happen at least with JMP16.2 and JMP17.0

 

Just as a heads up, there is fairly bad bug with JMP when using the mentioned combination (set each value, loop and col statistical functions). What happens is that JMP does use data from wrong table when adding data to different table. One workaround is to provide table preference within << Set Each Value (<< Set Each Value(Col Rank(Row(), dt:name))).

 

You seem to need (based on my fairly quick tests):

  1. Data tables with same column name(s)
  2. Loop over those
  3. Use << Set Each Value to set values in column
  4. The formula used should use Col statistical function WITH byVar (and most likely this has to have same name in both tables)

 

Below are some test cases

Names Default To Here(1);
// https://www.jmp.com/support/help/en/17.0/#page/jmp/scoping-operators.shtml#

// Initialize data
test_tables = {};
dt = Open("$SAMPLE_DATA/Big Class.jmp");
dt << New Column("REAL NAME RANK", Numeric, Continuous, Formula(Col Rank(Row(), :NAME)));
Column(dt, "age") << Set Modeling Type("Continuous");
Insert Into(test_tables, dt);
dt = Open("$SAMPLE_DATA/Students1.jmp");
dt << Sort(By(:name), Replace Table, Order(Ascending)); // sort to create worse scenario
dt << New Column("REAL NAME RANK", Numeric, Continuous, Formula(Col Rank(Row(), :NAME)));
Column(dt, "age") << Set Modeling Type("Continuous");
Insert Into(test_tables, dt);
/*
dt = Open("$SAMPLE_DATA/Students2.jmp");
dt << Sort(By(:name), Replace Table, Order(Ascending)); // sort to create worse scenario
dt << New Column("REAL NAME RANK", Numeric, Continuous, Formula(Col Rank(Row(), :NAME)));
Column(dt, "age") << Set Modeling Type("Continuous");
Insert Into(test_tables, dt);
*/

// use idx 1 to skip first table
min_idx = 0;

// without reversing we will run out of index as we are using wrong table
// but if we reverse we will get worse outcome (in my opinion) as we end up with wrong data
reverse_tableorder = 0; 


If(reverse_tableorder,
	test_tables = Reverse(test_tables);
);

For Each({dt, idx}, test_tables,
	// works
	// dt << New Column("COPY", Character, Nominal, << Set Each Value(:NAME));
	// dt << New Column("MULTI", Numeric, Continuous, << Set Each Value(:age * :age));

	If(idx > 0, // if the column isn't created to first table seems to work fine?
		// Doesn't work (// old table used due to some col function optimizations when using byVar?)
		Try(
			dt << New Column("RANK", Numeric, Continuous, << Set Each Value(Col Rank(Row(), :NAME)));
		,
			show(exception_msg);
		);

		Try(
			new_col = dt << New Column("RANK_SEPARATE", Numeric, Continuous);
			new_col << Set Each Value(Col Rank(:Row(), :name));
		,
			show(exception_msg);
		);

		Try(
			new_col = dt << New Column("SUM", Numeric, Continuous, << Set Each Value(Col Sum(1)));
		,
			show(exception_msg);
		);

		Try(
			new_col = dt << New Column("SUM_WITH_GROUP", Numeric, Continuous, << Set Each Value(Col Sum(1, :name)));
		,
			show(exception_msg);
		);

		Try( // with reverse_tableorder = 1, :age is taken from correct table BUT the group sizes from wrong? 
		// KATIE -> 72 (6*12) and not 11+12+12+13+14+15 -> 77
			new_col = dt << New Column("SUM_OF_COL", Numeric, Continuous);
			new_col << Set Each Value(Col Sum(:age, :name));
		,
			show(exception_msg);
		);

		// workaround
		dt << New Column("RANK_WITH_REF", Numeric, Continuous, << Set Each Value(
			Col Rank(Row(), dt:NAME)
		));
	);
);
-Jarmo
4 REPLIES 4
jthi
Super User

Re: Bug when using combination of set each value, tables with same column name(s) and col statistical functions within a loop

Got some good information not so good information from JMP support.

 

Good:

This is a bug with Col Rank function (not sure if other Col functions were tested) and will be fixed at some point (fix won't be there for JMP17.2). 

 

Definitely not so good:

<< Set Each Value shouldn't be used to set values to columns unless it is a constant or 'global function' (today()) and first argument in Col statistical functions should always be column reference. The current behaviours aren't considered bugs but rather unintended behaviours so they are a subject to change any point without any information. I will create wish list item so they would make these into supported behaviours instead of unintended behaviours.

 

Edit:Make using formula expressions in Set Each Value and using expressions as first argument in Col stat... 

-Jarmo
hogi
Level XII

Re: Bug when using combination of set each value, tables with same column name(s) and col statistical functions within a loop

Col statistics with the strict requirement to use a column as the fist  argument - forbidden to use an expression ?
This will be a huge step backwards!

 

Many of the beautiful tricks of @brady_brady wont work anymore

 

 

hogi
Level XII

Re: Bug when using combination of set each value, tables with same column name(s) and col statistical functions within a loop

with JMP18: great, seems to be fixed:

hogi_1-1726776134928.png


but Col Sums seem to still have an issue ...

hogi_2-1726776156148.png

 

this issue can be fixed by explicitly referring to dataTable:column, analogous to Jarmo's example in the fist post:

new_col = dt << New Column("SUM_WITH_GROUP", Numeric, Continuous, << Set Each Value(Col Sum(1, dt:name))); // dt:name instead of name

new_col << Set Each Value(Col Sum(dt:age, dt:name)); // dt:age(name) instead of :age(name)
hogi
Level XII

Re: Bug when using combination of set each value, tables with same column name(s) and col statistical functions within a loop

a related issue with formula columns:

How does JMP evaluate a column formula?