cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
powerpuff
Level IV

Compare lists, but partial strings

I have one list containing column names, like Final Name 1, Final Value 2 etc. and another list that has names like Name,Value, etc. How can I use pattern matching make matches like: Name == Final Name 1 for all columns? I want to extract its row values if the column matches the values in the list. Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
uday_guntupalli
Level VIII

Re: Compare lists, but partial strings

@powerpuff
    There is a cool way that I learned from @Craige_Hales to find common elements between lists. 


List1 = {"Final Output","Final Name","Final Place","Final Animal","Final Thing"}; List2 = {"Output","Name","Place","Animal","Thing","Final Output","Final Name","Final Place", "Final Animal","Final Thing"}; AA1 = Associative Array(List1); AA2 = Associative Array(List2); Intersection = AA1 ; Intersection << Intersect(AA2); print("common (intersection) items:",Intersection<<getkeys);

(https://community.jmp.com/t5/Discussions/Quick-way-to-compare-two-lists-and-identify-the-uncommon/td...

Best
Uday

View solution in original post

8 REPLIES 8
uday_guntupalli
Level VIII

Re: Compare lists, but partial strings

@powerpuff,
            A humble suggestion I would make is please provide samples of your data in data table or screenshots - so it makes it easier for people to understand what exactly you are asking. I am making the assumption that you have 2 lists that look like below: 

 

list1 = {"Final Name 1", "Final Name 2", "Final Name 3"}; 
list2 = {"Place","Animal","Name","Thing"}; 
MatchPos = {}; 

for(i = 1, i <= N Items(list1), i++,
		If(!IsMissing(Regex(Char(list1[i]),"Name")),
			Insert Into(MatchPos,i);
		  ); 
   );
   
Show(MatchPos);

If the above is true, you can easily search and find the match against your term of interest in this case "Name" as shown above. However, I could not follow the rest of the question around rows. Hope this helps, if not kindly invest time into making your question as structured and clear as possible. 

Best
Uday
powerpuff
Level IV

Re: Compare lists, but partial strings

@uday_guntupalli I am really sorry about that. My table looks similar to this:

Capture.PNG 

The list I have is list = {Name, Value, Result, Formula}

I have about 65 columns and the list contains about 30 names. 

Kevin_Anderson
Level VI

Re: Compare lists, but partial strings

Hi, powerpuff!

 

Here are a couple of list manipulation functions from Chapter 5 of the Second Edition of JSL Companion:

 

// JSL Companion: Applications of the JMP Scripting Language 
// Title: 	5_Extra_CustomListFunctions.jsl 
// Version: Windows JMP® 13.2.1
// Purpose: Provides examples of writing user functions. 

/*Notes:
    These functions are provided for learning as well as utility.
    For any script, where you would like to use these functions,
    use and Include("path\5_Extra_CustomListFunctions.jsl");
    
    The script will define 2 functions. After each is a comment
    block /*: Usage & Test examples ...  --- end Test*/. Remove
    The comment block and run the examples one line at a time. Open
    the embedded log or have the log window visible to see 
    the results.
  
  Motivation:  
    As seen in Chapter 5, a numeric function on a single layer
    list of numbers, or a vector, or a matrix, will return the
    same object with the function applied to each item. For example,
    xx = Sqrt({1,2,3,4,9}); //xx = {1, 1.4142135623731, 1.73205080756888, 2, 3}
     
    We have suggested that the same apply to String functions (that operate
    on a  single item) on a list  of text.  What we would like is
    xx = Substr({"Quaker", "Minnesota", "Nebraska"}, 1,1); would return
    xx = {"Q", "M", "N"}. However, currently most JMP String functions do not 
    allow list arguments.
  
  Functions defined here, are:
    - LFunc( mylist, expr); where expression is like expr(Substr(x,1,1)).
    
    - Like(SourceStr,SearchStr); both arguments are text/strings.  The second
      argument uses % and * as wild cards. Note this a simplified Regex.
     
    
    LFunc can save writing for-loops, Like save looking up the REGEX syntax.
*/

      
//------LFunc-------------------------------------------------------------------   
::Lfunc = Function( {xList, fExpr/*ex. Expr(Contains(x,"h"))*/},
	{x = {}, i = 1, errCode = 0, errStr = "\!NError bad argument", newExpr},
	If( !Is List( xList ),
		errCode = 1
	);
	If(
		Type( Name Expr( fExpr ) ) == "Name", 
 //if Name then JMP does not recognize the function
			tmp = Trim( Char( Name Expr( fExpr ) ) );
  //extract the function and check if it is contained in the global namespace, if not error!
			If( Namespace( "global" ) << Contains( Word( 1, tmp, ":(" ) ) == 0,
				errCode = errCode + 2
			);,
		Type( Name Expr( fExpr ) ) != "Expression", errCode = errCode + 2
	); 
 //catches typos, returns Name if not a valid function
	//catches typos, returns Name if not a valid function
	If( !Is Expr( Name Expr( fExpr ) ),
		errCode = errCode + 2
	);   //catches syntax
	If(
		!errCode,
			For( i = 1, i <= N Items( xList ), i++,
				newExpr = Substitute( Name Expr( fExpr ), Expr( x ), xList[i] );
				Insert Into( x, Try( newExpr, Empty() ) );
			),
		errCode == 1, Write( errStr || "(1) not a list \!N" ),
		errCode == 2, Write( errStr || "(2) not a valid expression \!N" ),
		errCode == 3, Write( errStr || "s (1) is not a list and (2) is not a valid expression \!N" )
	);
	x//return x
	;
); //End Lfunc
//------------------------------------------------------------------------------   
 
//Usage & Test ...remove the slash asterisk and run in segments, results are in the log
//------------------------------------------------------------------------------    
  myList={"john", "harry", "huh","ah ha", "xyx", , "a", "h", 7, [1,7]};
  
  /* returns a list of numeric values, one value for each item in the list 
   n =>1st location of h, 0 => does not contain, Empty() => the item is not 
   a valid argument for that expression.
  */
zz = Lfunc(myList,Expr(Contains(x,"h")));  
show(zz);
//-----------------------------------------------------------------------------------------
/* returns a list of how many words are in each string 
   [] -> no occurrences
   Empty() => the expression is not valid for that item.
*/
zz = Lfunc(mylist, expr(nitems(words(x))));
show(zz);
//-----------------------------------------------------------------------------------------
/* This specified function requires numeric arguments 
   Empty() => the expression is not valid for that item.
   A numeric function applied to a matrix returns a matrix 
   of the results applied on each element in the matrix.
   Only the last 2 items are valid arguments for Log()
*/

zz = Lfunc(mylist, expr(log(x)));
show(zz);
//----------------Error Checking---------------------------------------------------------------
zz = Lfunc("oh johhny", Expr(Contains(x,"h")) );     //not a list
show(zz);
zz = Lfunc({"oh johhny"}, Expr(Contians(x,"h")) );  //type not a valid expr due to typo
show(zz);
zz =  Lfunc("oh johhny", Expr(Contians(x,"h")) );   //not a list and not a valid expr
show(zz);
//----- end test */

//====================================================================================


//------Like-------------------------------------------------------------------   
//--- New function that converts the SQL "like" format to a regular expression
//    Uses % wildcard, ex A%B%C  must start with A, and end with C and B in the middle  
//    %A%B%C% and pattern with A, B and C in that order
//    We think this covers most cases where reg expr special characters 
//    need an escape, a forward slash, ex.  ) needs \)  . needs \. etc.
//    If you find others, add to mList. 

::Like = Function( {SourceStr, SearchStr},
	{patStr = "^", i, ret, mList = {"(", ")", "$", "[", "]", ".", "*", "^"}}, 

	Substitute Into( SearchStr, "%%", "-|~" );  //cheat , will not handle -|~
	For( i = 1, i <= N Items( mList ), i++,
		Substitute Into( SearchStr, mList[i], "\" || mList[i] )
	);
	Substitute Into( SearchStr, "%", ".*" );
	Substitute Into( SearchStr, "-|~", "%" );   //replace 
	patStr ||= SearchStr;
	ret = Try( Regex Match( SourceStr, patStr )[1], "" );
 //show(patStr, ret,SourceStr);
	If( Length( ret ) > 0 & ret == SourceStr,
		SourceStr,
		""
	);
); //end Like

/* Usage & Test ... remove the slash asterisk and run in segments, results are in the log
//------------------------------------------------------------------------------    
zz=Like("JSL COMPANION VERSION 2", "%SL%MP%ION%");
show(zz);
zz=Like("JSL COMPANION VERSION 2", "SL%MP%ION%");  //will not match
show(zz);
zz=Like("JSL COMPANION VERSION 2", "JSL%");  //starts with
show(zz);
zz=Like("JSL COMPANION VERSION 2", "jsl%");  //starts with not case sensitive
show(zz);
zz=Like("EDGE INFILM", "%fil%");  //ends with not case sensitive
show(zz);
zz=Like("EDGE INFILM", "%fil");  //will not match
show(zz);
//when the SourceStr has the %, the SearchStr, must use %% for the single % 
//function needs at least one character before the occurence of  "%" and 1 after.
zz=Like("This is a % test and a * test.", "%a %%%d%*%");  //contains
show(zz);

//===============Apply Like to each item in a list using Lfunc
//create xxList a test case
xxList = {"CAD.DFM_COUNT (GROUP=C1;LOC=CENTER)(MODULE=*)(X-BAR)", "CAD.DFM_COUNT (GROUP=C2;LOC=CENTER)(MODULE=*)(X-BAR)",
"CAD.DFM_COUNT (GROUP=C3;LOC=CENTER)(MODULE=*)(X-BAR)", "CAD.DFM_COUNT (GROUP=C1;LOC=EDGE)(MODULE=*)(X-BAR)",
"CAD.DFM_COUNT (GROUP=C2;LOC=EDGE)(MODULE=*)(X-BAR)", "CAD.DFM_COUNT (GROUP=C3;LOC=EDGE)(MODULE=*)(X-BAR)",
"SUM.CAD.DFM_%GROUP (GROUP=C1;LOC=CENTER)(MODULE=*)(X-BAR)", "SUM.CAD.DFM_%GROUP (GROUP=C2;LOC=CENTER)(MODULE=*)(X-BAR)",
"SUM.CAD.DFM_%GROUP (GROUP=C3;LOC=CENTER)(MODULE=*)(X-BAR)", "SUM.CAD.DFM_%GROUP (GROUP=C1;LOC=EDGE)(MODULE=*)(X-BAR)",
"SUM.CAD.DFM_%GROUP (GROUP=C2;LOC=EDGE)(MODULE=*)(X-BAR)", "SUM.CAD.DFM_%GROUP (GROUP=C3;LOC=EDGE)(MODULE=*)(X-BAR)"};

//will report the length of all that items that match pattern
zz= Lfunc(xxList, expr( length(::Like(x,"%C1%")) ) );
show(zz);

//will find the list index for each match using a variable ref
sstr = "%C1%";
zz= Loc(Matrix(Lfunc(xxList, expr( length(::Like(x,sstr)) ) ) )>0);
show(zz, if(nrow(zz)>0,xxList[zz],{} ));

//will find the list index for each item that contains a %GROUP
sstr = "%_%%GROUP%";
zz= Loc(Matrix(Lfunc(xxList, expr( length(::Like(x,sstr)) ) ) )>0);
show(zz, if(nrow(zz)>0,xxList[zz],{} ));
----- end test */

Try running some of the commented test cases to get a feel for how they work and whether they will meet your needs.

 

Good luck!

powerpuff
Level IV

Re: Compare lists, but partial strings

@uday_guntupalli How can I rename all the values in list a1 with the word "Final" for the words to look like the column names in the table?

for(i = 1, i <= N Items(a1), i++,
c = "Final " || a1[i];
show(c);

);

Thank you.

uday_guntupalli
Level VIII

Re: Compare lists, but partial strings

@powerpuff
        Let us say your table looks like what is shown below. 

image.png

      

     You can use the following piece of code to update the column names by appending "Final" or any other word as shown below. You were on the right track, the only thing you needed to add was, getting a reference for the column and add a set name command 

 

dt = current data table(); 

ColNames = dt << Get Column Names("String"); 

for(i = 1, i <= N Items(ColNames), i++,
		Col = Column(dt,i); // get reference for column 
		Col << Set Name("Final " || ColNames[i]); 
   );

 

Best
Uday
powerpuff
Level IV

Re: Compare lists, but partial strings

Thanks Uday @uday_guntupalli 

One last question: If I have 2 lists
a1 = {Final Output, Final Name, Final Place, Final Animal, Final Thing } and
ColNames = {Output, Name, Place, Animal, Thing, Final Output, Final Name, Final Place, Final Animal, Final Thing }
How can I compare and find if the values in the list match? The length of these lists are also unequal and I can't seem to figure out a way to loop it around.
Something like:
if(ColNames[i] == a1[],
show(ColNames[i]);
);

Thanks very much.

uday_guntupalli
Level VIII

Re: Compare lists, but partial strings

@powerpuff
    There is a cool way that I learned from @Craige_Hales to find common elements between lists. 


List1 = {"Final Output","Final Name","Final Place","Final Animal","Final Thing"}; List2 = {"Output","Name","Place","Animal","Thing","Final Output","Final Name","Final Place", "Final Animal","Final Thing"}; AA1 = Associative Array(List1); AA2 = Associative Array(List2); Intersection = AA1 ; Intersection << Intersect(AA2); print("common (intersection) items:",Intersection<<getkeys);

(https://community.jmp.com/t5/Discussions/Quick-way-to-compare-two-lists-and-identify-the-uncommon/td...

Best
Uday
powerpuff
Level IV

Re: Compare lists, but partial strings

Thanks very much @uday_guntupalli that was very helpful.