cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
  • Sign-in to the JMP Community will be unavailable intermittently Dec. 6-7 due to a system update. Thank you for your understanding!
  • We’re retiring the File Exchange at the end of this year. The JMP Marketplace is now your destination for add-ins and extensions.
  • JMP 19 is here! Learn more about the new features.

Discussions

Solve problems, and share tips and tricks with other JMP users.
Choose Language Hide Translation Bar
datanaut
Level III

Formula to check if any of a list of items is contained in the rows of a column

JMP for Mac 

Version 17.2.0 (701896)

 

I have a column "Data" with rows of number_number like this:

1_3

3_7

6_5

...

 

I want to test this column to see if some of the rows contain values from a list of values and then flag or label the rows which do.

 

I know I could construct an elaborate formula like this:

If( :Die X_Y == "3_9" | :Die X_Y == "3_4",
	1,
	0
)

But that's very laborious for long lists of different values to check.  I'd like to just be able to paste a list of values to check for in a formula, delimited by a space or comma.  

 

I found the function "Contains Item."  It works fine for one item but the help entry doesn't offer any insight on how to construct the "list" option.  I tried this (specifying the optional delimiter as a space):

If( Contains Item( :Die X_Y, "3_4 3_9", " " ),
	"exclude",
	"not exclude"
)

and this:

If( Contains Item( :Data, "3_4,3_9", "," ),
	"exclude",
	"not exclude"
)

I understand the default delimiter is a comma but wanted to enforce to find anything that works.  Neither Contains Item function is returning the sought-for behavior (logical 1 if there is a match, 0 if not).  If there is a better function to use please advise.  

 

This is a generalized problem I'd like to be able to solve -- inspecting a column to find matches with long lists of items.  

 

Thanks

 

15 REPLIES 15
hogi
Level XIII

Re: Formula to check if any of a list of items is contained in the rows of a column

the reading is: 3 arguments

  1. x - a string of characters
  2. 3 options:
    - item to search
    - a list of several items - does one of them match?
    - a pattern to search
  3. optional argument: the delimiter to split x in words

 


@datanaut wrote:

So (to my simplistic reading) either x or the {list} could contain the words to be searched or the list of words to compare it with.  


-> yes.

 

 included a Column as in Jarmo's example 

For a column formula, this is always possible, along the idea:
For every row, JMP will take automatically the entry of the column in the respective row.
When you use Formula editor, you do this trick all the time - just without noting.

hogi
Level XIII

Re: Formula to check if any of a list of items is contained in the rows of a column

A nice side effect:

This discussion made me think:
does Contains also increase the speed of Col .. aggregations?

And believe it or not - it does. wow, it does!!!!

https://community.jmp.com/t5/Discussions/How-do-I-use-the-Col-Maximum-Formula-with-a-quot-where-quot... 

 

Any clue why the character comparison is SOOO slow?

hogi
Level XIII

Re: Formula to check if any of a list of items is contained in the rows of a column

I wondered if deleting rows without selecting them is much faster:

dt << select rows(myRows) << delete rows();
dt << delete rows();

[actually, I wondered why the corresponding << exclude(myRows) is missing.]

 

So, here is another speed test:

Names Default to here(1);
get timing = Function( {myExpr},
	dt = New Table( "test", add rows( 10000000 ), New Column( "rnd", set each value( Random Integer( 1000 ) ) ) );
	myRows = Random Shuffle( Index( 1, N Rows() ) )[Index( 1, N Rows() / 2 )];
	t0 = HP Time();
	myExpr;
//print(n rows()); // same timing without this line
	If( N arg( myExpr ) > 2,
		Print( Char( Head( Arg( myExpr, 2 ) ) ) || " + " || Char( Head( Arg( myExpr, 3 ) ) ) ),
		Print( Char( Head( Arg( myExpr, 2 ) ) ) );
	);
	Print( Char( (HP Time() - t0) / 1000000 ) || " s\!n" );
	Close( dt, noSave );
);

get timing( Expr( dt << select rows( myRows ) << delete rows ) );

get timing( Expr( dt << delete rows( myRows ) ) );

get timing( Expr( dt << select rows( myRows ) << exclude( 1 ) ) );

Print("-------------");

get timing( Expr( dt << select where( :rnd > 500 ) ) );

get timing( Expr( dt << get rows where( :rnd > 500 ) ) );

get timing( Expr( myRows = Where( dt, :rnd > 500 ) ) );

Print("_____________");
	

hogi_1-1764746400738.png

So,  deleting rows without selecting them is just marginally faster.
-> no wonder there is no corresponding function for Exclude.

hogi
Level XIII

Re: Formula to check if any of a list of items is contained in the rows of a column

by the way:

Print("_________");
Print("__X______");
Print("X________");

hogi_2-1764747078244.png

why?

jthi
Super User

Re: Formula to check if any of a list of items is contained in the rows of a column

My guess is that Print is doing some parsing/evaluations as it is meant to be used for displaying values of arguments. And when you start -"variable" with two underscores it is considered hidden -> possibly some interaction with Print() to not make it leak those values.

-Jarmo
hogi
Level XIII

Re: Formula to check if any of a list of items is contained in the rows of a column

show("__");
myString="__";
show(myString);
show("_"|| "_");

__x="secret";
show(__x);
show(name Expr(__x));
show(Eval(__x));

hogi_0-1764749234016.png

 

a second layer of ulra-tight protection ; )

Recommended Articles