Solved: Re: Better way to select columns matching criteria in another column in same tab...

Ressel · Nov 7, 2022 10:25 AM

The script at the end of this post basically does what I want it to do (i.e., finding columns where a Y-value is 100 for time = 0), but it also generates a log message that makes me a little uncomfortable (having stepped into traps before):

A message with an empty Boolean argument enabled an option that was already enabled. That option remains enabled. In previous versions of JMP, this would have disabled the option.

This message is printed to the log for every column that is selected again, after it has already been selected once as part of my loop construction. As someone relatively new to scripting, I have two main concerns:

Is this log message something I absolutely should avoid and will I run into trouble? I will eventually be processing larger tables, so this log message may be printed hundreds of times for a single script execution.
Is there a smarter way to find all columns in a table that have a Y-value = 100 for each row where time = 0? (In my case, I will be processing tables with many columns, some of them containing results that are normalized to 100% rel. at time =0. These are the columns I am interested in.)

Names Default To Here( 1 );

// table required for example
dt = New Table( "Results (absolute & relative) vs time",
	Add Rows( 8 ),
	New Column( "time",
		Numeric,
		Set Values( [0, 1, 2, 3, 0, 1, 2, 3] ),
	),
	New Column( "Y1_absolute",
		Numeric,
		Set Values( [7, 8, 1, 9, 0.2, 0.4, 0.65, 0.98] ),
	),
	New Column( "Y1-relative",
		Numeric,
		Set Values( [100, 114, 14, 129, 100, 200, 325, 490] ),
	),
	New Column( "Y2_absolute",
		Numeric,
		Set Values( [12, 6, 4, 0, 80, 90, 120, 75] ),
	),
	New Column( "Y2_relative",
		Numeric,
		Set Values( [100, 50, 33, 0, 100, 113, 150, 94] ),
	)
);

// selecting columns where for :time == 0 a Y-value == 100 is found
For( i = 1, i <= N Row( dt ), i++,
	For( j = 1, j <= N Col( dt ), j++,
		If( Column( j )[i] == 0,
			For( k = 1, k <= N Col( dt ), k++,
				If( Column( k )[i] == 100,
					Column( k ) << Set Selected
				)
			)
		)
	)
);

pauldeen · Nov 7, 2022 11:26 AM

Adding (1) after << set selected takes care of the error message. Instead of toggling it, you are defining the end state you want:

For( i = 1, i <= N Row( dt ), i++,
	For( j = 1, j <= N Col( dt ), j++,
		If( Column( j )[i] == 0,
			For( k = 1, k <= N Col( dt ), k++,
				If( Column( k )[i] == 100,
					Column( k ) << Set Selected(1)
				)
			)
		)
	)
);

View solution in original post

pauldeen · Nov 7, 2022 11:26 AM

Adding (1) after << set selected takes care of the error message. Instead of toggling it, you are defining the end state you want:

For( i = 1, i <= N Row( dt ), i++,
	For( j = 1, j <= N Col( dt ), j++,
		If( Column( j )[i] == 0,
			For( k = 1, k <= N Col( dt ), k++,
				If( Column( k )[i] == 100,
					Column( k ) << Set Selected(1)
				)
			)
		)
	)
);

Ressel · Nov 7, 2022 11:40 AM

Much appreciated! I think I will edit that loop a little further.

There's no reason to traverse through the entire table once the correct columns have been identified. (Edit: that might be a bit hard to do with my current skill level. Different Y columns may have missing values in different rows. I need to find all Y columns where time = 0 matches with one or more Y-value = 100.)
Why loop through all rows and not just the ones where time is actually zero?

Edit: Stayed with the solution presented. Looks robust and didn't want to use yet more time. But bullet #2, i.e., not looping through all rows of the table is something I should and probably will look into at some point.

pauldeen · Nov 7, 2022 01:52 PM

Some ideas:

You could try and subset the table on time=0 before you proceed with your loop. That would unpack one layer of your for loop.

You could build a list of columns and remove the ones you have selected. Then for each next iteration you would only have to loop through the remaining unselected columns.

Ressel · Nov 7, 2022 03:02 PM

Thank you again. Subsetting did improve the performance on my real (large) table. Improved solution posted below. I had thought about this before, but obviously needed a short break to get there. What I hadn't even considered though was building a list of columns and iteratively removing items. Thank you also for that!

Names Default To Here( 1 );

// new table required for example
dt = New Table( "Results (absolute & relative) vs time",
	Add Rows( 8 ),
	New Column( "time",
		Numeric,
		Set Values( [0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3] ),
	),
	New Column( "Y1_absolute",
		Numeric,
		Set Values( [7, 8, 1, 9, 0.2, 0.4, 0.65, 0.98, 0.7, ., 0.83, 0.4] ),
	),
	New Column( "Y1-relative",
		Numeric,
		Set Values( [100, 114, 14, 129, ., 200, 325, 490, 100, 330, 540, 213] ),
	),
	New Column( "Y2_absolute",
		Numeric,
		Set Values( [12, 6, 4, 0, 80, 90, 120, 75, 8.3, 9.7, 6.4, 5.2] ),
	),
	New Column( "Y2_relative",
		Numeric,
		Set Values( [., 50, 33, 0, 100, 113, 150, 94, 100, 66, 37, 12] ),
	)
);

// finding and selecting Y columns where for any :time == 0 a value of 100 is found in the same row
dt << Select Where( dt:time == 0 );
dtTemp = dt << Subset ( Selected Rows, Selected Columns Only( 0 ) );

For( i = 1, i <= N Row( dtTemp ), i++,
	For( j = 1, j <= N Col( dtTemp ), j++,
		If( Column( dtTemp, j )[i] == 100,
			Column( dt, j ) << Set Selected( 1 ),
		)
	)
);

Close( dtTemp, no save );
dt << Clear Select(  );

ian_jmp · Nov 8, 2022 07:00 AM

FWIW, this avoids the subset table, but is probably not much faster:

Names Default To Here( 1 );

// table required for example
dt = New Table( "Results (absolute & relative) vs time",
	Add Rows( 8 ),
	New Column( "time",
		Numeric,
		Set Values( [0, 1, 2, 3, 0, 1, 2, 3] ),
	),
	New Column( "Y1_absolute",
		Numeric,
		Set Values( [7, 8, 1, 9, 0.2, 0.4, 0.65, 0.98] ),
	),
	New Column( "Y1-relative",
		Numeric,
		Set Values( [100, 114, 14, 129, 100, 200, 325, 490] ),
	),
	New Column( "Y2_absolute",
		Numeric,
		Set Values( [12, 6, 4, 0, 80, 90, 120, 75] ),
	),
	New Column( "Y2_relative",
		Numeric,
		Set Values( [100, 50, 33, 0, 100, 113, 150, 94] ),
	)
);

// Rows that have time = 0
r0 = dt << getRowsWhere(:time == 0);

// Loop over other columns (assumes time is in the first column)
for(c=2, c<=NCol(dt), c++,
	if(Contains(Column(dt, c)[r0], 100), Column(dt, c) << setSelected(1));
);

Better way to select columns matching criteria in another column in same table?

Re: Better way to select columns matching criteria in another column in same table?

Re: Better way to select columns matching criteria in another column in same table?

Re: Better way to select columns matching criteria in another column in same table?

Re: Better way to select columns matching criteria in another column in same table?

Re: Better way to select columns matching criteria in another column in same table?

Re: Better way to select columns matching criteria in another column in same table?