
Can I use Contains() within Match()?

jmpbeginner

Community Trekker

Joined:

Sep 11, 2013

I need to search a character column to see whether each row's value contains specific strings, and then create an indicator name/group accordingly in a new column (it will be searching for ~14 specific strings).  I'm pretty sure a nested If() would be pretty slow...  I have used Match() in the past when I wanted to match exact strings, but can I use wildcards?

I tried * and % but to no avail:

 

Match( :columnOfInterest,
     "*StringA*", "String A",
     "*StringB*", "String B",
     "" )

I then tried to use Contains() within the Match() and still no luck:

 

Match( :columnOfInterest,
     Contains(:columnOfInterest, "StringA"), "String A",
     Contains(:columnOfInterest, "StringB"), "String B",
     "" )

I see that Contains() returns a number, so I also tried adding a > comparison and even wrapping it in Num():

 

Match( :columnOfInterest,
     Contains(:columnOfInterest, "StringA") > 0, "String A",
     Contains(:columnOfInterest, "StringB") > 0, "String B",
     "" )

and

 

Match( :columnOfInterest,
     Num(Contains(:columnOfInterest, "StringA")) > 0, "String A",
     Num(Contains(:columnOfInterest, "StringB")) > 0, "String B",
     "" ) 

any help is greatly appreciated.  thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Solution

Regex is the way to go. Here is a very simple example that will do the multiple matching you want.  Just expand it with the strings you are looking for.  It will return the match it finds.

 

 

names default to here(1);
 
test="Answer is ABC";
TheMatch = regex(test,"(ABC|DEF|GHI)","\1");
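
As a rough sketch of how the same call behaves on a few different inputs, the loop below runs it over three made-up strings (the sample values are assumptions, not from the original post). Regex() returns the matched text, or a numeric missing value when none of the alternatives is found, so Is Missing() can be used to supply a fallback.

```jsl
Names Default To Here( 1 );

// Hypothetical sample values, for illustration only
samples = {"Answer is ABC", "GHI then DEF", "no match here"};

For( i = 1, i <= N Items( samples ), i++,
     found = Regex( samples[i], "(ABC|DEF|GHI)", "\1" );
     // Regex() gives a missing value when no alternative matches
     If( Is Missing( found ),
          found = ""
     );
     Show( samples[i], found );
);
```

When a string contains more than one of the terms, only the first match found is returned.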

 

Jim
7 REPLIES
Phil_Brown

Super User

Joined:

Mar 20, 2012

JMPbeginner

I would avoid conditional statements altogether. A better solution is to use Regex. For example, suppose your search terms were car models such as "Accord", "Civic", etc. A Regex statement can be written that says "Match ANY of the terms". In JSL, a column formula for this would look like:

Regex( :columnOfInterest,  "(Accord|Civic|Corolla|Dakota|Integra|Maxima|Ranger|S10)" );

Attached is a simple demo file of this. Regex is extremely powerful at string pattern matching. There are numerous options for matching partial words, word stems, etc. Hope this helps!
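
For readers without the attachment, here is a minimal sketch of how such a formula column could be set up by script. The table name, the new column name "model", and the row values are all made up for illustration; only the Regex pattern comes from the formula above.

```jsl
Names Default To Here( 1 );

// Hypothetical demo table; the values are invented for this sketch
dt = New Table( "Regex demo",
     New Column( "columnOfInterest", Character,
          Set Values( {"2004 Honda Accord LX", "Toyota Corolla", "Ford F-150"} )
     )
);

// Character formula column holding whichever model name Regex finds
dt << New Column( "model", Character,
     Formula(
          Regex( :columnOfInterest, "(Accord|Civic|Corolla|Dakota|Integra|Maxima|Ranger|S10)" )
     )
);
```

Rows whose text contains none of the listed terms, like the third one here, should get no value in the new column.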

-PBZ

PDB
jmpbeginner

Community Trekker

Joined:

Sep 11, 2013

this all makes sense, now I'm going to try to complicate things...

 

regex(test,"(ABC|DEF|GHI)","\1")

can I return a different string than what is matched?

  • if "ABC" is found, return "group1",
  • if "DEF" is found, return "group2", etc

 

or is that not possible with regex?

Phil_Brown

Super User

Joined:

Mar 20, 2012

JMPbeginner

12586_Screen Shot 2016-08-23 at 7.34.19 PM.png

 

You could use an Associative Array to store a lookup table of the strings you want to map to different strings.

 

For example, above :searchTerms and :mappedTerms are the keys and values of such an associative array. In the code below, this array is assigned to the variable "lookup". In the final formula for the RESULTS-mapped column, you'll see that every match found by the Regex serves as the argument for the lookup array.

 

e.g. Take row 1 in the above table. Whatever term the Regex finds there is used as the key into the lookup array; a found result of "Accord", for instance, gives the mapped result lookup["Accord"], which is "GroupA".

 

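// Formula expression: build the lookup once on row 1, then reuse it on later rows
fmlaExpr = Expr(
     If( Row() == 1,
          // Map each search term to its group name
          lookUp = Associative Array( :searchTerms, :mappedTerms, Empty() );
          vals = :searchTerms << get values;
          // Look up whichever term Regex finds, using a pattern like "(term1|term2|...)"
          doThis = Expr(
               lookUp[Regex( :ColumnOfInterest, Char( "(" || Concat Items( vals, "|" ) || ")" ) )]
          );
          doThis;
        ,
          doThis
     )
);
// Name() is needed because the column name contains a hyphen
Eval( Eval Expr( :Name( "RESULTS-mapped" ) << set formula( Expr( Parse( Char( NameExpr(fmlaExpr)) ) ) ) ) );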

 

Attached is the complete example table.
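
For anyone who wants to try the idea without a data table, a stripped-down sketch of the same lookup on a single string follows. The terms and group names are borrowed from the question ("ABC" to "group1", and so on); the variable names and the "GHI" to "group3" pairing are assumptions for illustration.

```jsl
Names Default To Here( 1 );

// Hypothetical term-to-group mapping
lookup = Associative Array(
     {"ABC", "DEF", "GHI"},
     {"group1", "group2", "group3"}
);

// Build the alternation pattern "(ABC|DEF|GHI)" from the keys
keys = lookup << Get Keys;
pattern = "(" || Concat Items( keys, "|" ) || ")";

test = "Answer is DEF";
found = Regex( test, pattern );

// Only index the array when Regex actually found one of the terms
group = If( Is Missing( found ), "", lookup[found] );
Show( found, group );
```

Keeping the mapping in one associative array means that adding a new term only requires a new key/value pair, since the pattern is rebuilt from the keys.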

PDB
vkessler

Community Trekker

Joined:

Dec 23, 2015

This solved my problem. Totally forgot about associative arrays. Thanks a lot!
jmpbeginner

Community Trekker

Joined:

Sep 11, 2013

thanks PBZ and Jim, Regex() works great and now I have a new tool to use while scripting! much appreciated!

Phil_Brown

Super User

Joined:

Mar 20, 2012

Note that the "\1" doesn't actually make a difference here, since the pattern has only one capture group (one pair of parentheses). If there were multiple pairs of parentheses, one could use "\1" through "\n", where n is at most the number of groups, to choose which group is returned. See the JMP online documentation on Regular Expressions.
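
To illustrate with more than one pair of parentheses, here is a small sketch. The test string and the pattern are made up for this example and are not from the thread.

```jsl
Names Default To Here( 1 );

// Hypothetical input: three capital letters, a hyphen, then digits
test = "Answer is ABC-123";
pattern = "([A-Z]{3})-(\d+)";

letters = Regex( test, pattern, "\1" );  // text captured by the first group
digits = Regex( test, pattern, "\2" );   // text captured by the second group
swapped = Regex( test, pattern, "\2:\1" ); // both groups, reordered
Show( letters, digits, swapped );
```

The third argument works as a template, so the groups can be combined or reordered as well as picked individually.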

PDB