cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
uday_guntupalli
Level VIII

Search For Strings Using Partial String

All,

     I am trying to search for string matches with partial  string as the search criteria . I have tried multiple ways as I show below - I believe there must be a better way using RegEx or Pat Match to do this - but I am unable to get it right. Any help will be appreciated .


For the Sake of an example , I am using Air Traffic Sample Data table :

// Method # 1 Using Pat Match

dt  = Current Data Table();

ColNames = dt << Get COlumn Names();

Match = list();

sp = Pat Span("ID");

Pat Match(ColNames,sp>>Match);

Match; 

// This yields no matches . However in the same code - if ID itself was a column name - this works . So I want to understand how to use part of the search string to do the search .

// Method # 2 Using Contains()

for(     i=1, i<= N Items(ColNames),i++,

                 If(Contains(ColNames,"ID")== 1,

                           Insert Into(Match,i);

                   );

   );

Best
uday

Best
Uday
1 ACCEPTED SOLUTION

Accepted Solutions
Craige_Hales
Super User

Re: Search For Strings Using Partial String

maybe like this:

dt  = Current Data Table();

ColNames = dt << Get COlumn Names();

colstrings = char(colnames);

match = regex(colstrings, "[{,]\s*([^{,}]*ID[^{,}]*)[,}]", "\1");

show(Match);

Match = "Flight ID";

which depends on the column names not using commas or curly braces.  The char function turns the list of names into a string:

{Airline, Carrier Code, Flight Number, Tail Number, Flight ID, Event, Airport, Time,

Original Time, Longitude, Latitude}

becomes

"{Airline, Carrier Code, Flight Number, Tail Number, Flight ID, Event, Airport, Time, Original Time, Longitude, Latitude}"

I chose regex for this because...well it won't be beautiful either way.  Breaking down the regex pattern:

"[{,]     either an open curly brace or a comma

\s*      optional white space

([^{,}]*ID[^{,}]*)    the outer parens are the \1 match (more below)

[,}]"     either a comma or a close curly brace

breaking down the \1 part:

[^{,}]*     zero or more characters that are not curly braces or commas

ID          the word you are hunting in a column name

[^{,}]*     zero or more characters that are not curly braces or commas


the third arg to regex, "\1", means use the first open paren to identify the replacement text.

Craige

View solution in original post

3 REPLIES 3
Craige_Hales
Super User

Re: Search For Strings Using Partial String

maybe like this:

dt  = Current Data Table();

ColNames = dt << Get COlumn Names();

colstrings = char(colnames);

match = regex(colstrings, "[{,]\s*([^{,}]*ID[^{,}]*)[,}]", "\1");

show(Match);

Match = "Flight ID";

which depends on the column names not using commas or curly braces.  The char function turns the list of names into a string:

{Airline, Carrier Code, Flight Number, Tail Number, Flight ID, Event, Airport, Time,

Original Time, Longitude, Latitude}

becomes

"{Airline, Carrier Code, Flight Number, Tail Number, Flight ID, Event, Airport, Time, Original Time, Longitude, Latitude}"

I chose regex for this because...well it won't be beautiful either way.  Breaking down the regex pattern:

"[{,]     either an open curly brace or a comma

\s*      optional white space

([^{,}]*ID[^{,}]*)    the outer parens are the \1 match (more below)

[,}]"     either a comma or a close curly brace

breaking down the \1 part:

[^{,}]*     zero or more characters that are not curly braces or commas

ID          the word you are hunting in a column name

[^{,}]*     zero or more characters that are not curly braces or commas


the third arg to regex, "\1", means use the first open paren to identify the replacement text.

Craige
uday_guntupalli
Level VIII

Re: Search For Strings Using Partial String

Craige@JMP​ :
    While the solution you provided works - if I may I have 2 follow up questions :

1. How would you get the indices when there are matches found ?

2. If the column names look like this - which they do in my case

Name( "IPhone 7, California, Yes" )  - ColNames
Name(\!"IPhone 7, California, Yes\!") - ColString

    Please advice .

Best

Uday

Best
Uday
Craige_Hales
Super User

Re: Search For Strings Using Partial String

Sounds like an explicit loop will be easier and more clear.

for( icol = 1, icol <= nitems(colnames), icol++,

// use regex on char(colnames) and break out of this loop when you get the match you need.

)

Since the column names DO have commas, and you need the icol index, and the regex can be a lot simpler.  This isn't tested, but something like

if( ! ismissing( regex( char(colnames[ icol ]), "ID" ) ), break() );

would test each icol for ID in the name.  regex returns missing if the match fails.

Craige