cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
Shlomizr
Level I

how to extract string between specific pattern using regexp?


Hi,
I am new to regex, i need to extract the name of users from their id, for example:
I have the following long string
123450_dan; 154396_kelli; 198756_dev;
Result need to be: dan
What regex do i need to use in order to extract the name dan which have the id 123450?

Thanks in advance.
3 ACCEPTED SOLUTIONS

Accepted Solutions

Re: how to extract string between specific pattern using regexp?

You did not specifiy much detail about the ID so here are a few examples to get started. I used a JMP script, but this extraction could be used in a column formula, too.

 

Names Default to Here( 1 );

// test string
string = "123450_dan; 154396_kelli; 198756_dev;";

// assume specific ID
name = Regex( string, "123450_(\w+\b)", "\1" );
Show( name );

// assume first ID with 6 numbers
name = Regex( string, "\d{6}_(\w+\b)", "\1" );
Show( name );

// assume first ID with any number of numbers
name = Regex( string, "\d+_(\w+\b)", "\1" );
Show( name );

View solution in original post

gzmorgan0
Super User (Alumni)

Re: how to extract string between specific pattern using regexp?

The script that I provided previously using JMP functions would not require any modifications.  The regular expression to capture one or more "names" after the underscore would be (\w+\s*)+ which means accumulate ( ) at  least one word \w+, zero or more spaces \s* and the trailing plus implies repeat the pattern one or more times.

 

Names Default to Here(1);
string="123450_dan fox; 154396_kelli ann marie; 198756_dev;";

name1 = Regex( string, "123450_((\w+\s*)+\b)", "\1" );
name2 = Regex( string, "154396_((\w+\s*)+\b)", "\1" );

Show( name1, name2 ); 
/*: name1 = "dan fox"; name2 = "kelli ann marie"; */

View solution in original post

Re: how to extract string between specific pattern using regexp?

Use this form instead:

 

name = Regex( string, "123450_([a-zA-Z ]+);", "\1" );

View solution in original post

9 REPLIES 9

Re: how to extract string between specific pattern using regexp?

You did not specifiy much detail about the ID so here are a few examples to get started. I used a JMP script, but this extraction could be used in a column formula, too.

 

Names Default to Here( 1 );

// test string
string = "123450_dan; 154396_kelli; 198756_dev;";

// assume specific ID
name = Regex( string, "123450_(\w+\b)", "\1" );
Show( name );

// assume first ID with 6 numbers
name = Regex( string, "\d{6}_(\w+\b)", "\1" );
Show( name );

// assume first ID with any number of numbers
name = Regex( string, "\d+_(\w+\b)", "\1" );
Show( name );
Shlomizr
Level I

Re: how to extract string between specific pattern using regexp?

Thanks Mark,
Working perfect!!
Shlomizr
Level I

Re: how to extract string between specific pattern using regexp?

Hi Mark,
Just 1 question, what do i need to add if i have space in the user name:

for example: "123450_dan fox; 154396_kelli; 198756_dev;"
I tried the following Regex:

string="123450_dan fox; 154396_kelli; 198756_dev;"

name = Regex( string, "123450_(\w+ | \s+)", "\1" );
Show( name );

Result: "dan" instead of "dan fox"

Thanks in advance.

Re: how to extract string between specific pattern using regexp?

Use this form instead:

 

name = Regex( string, "123450_([a-zA-Z ]+);", "\1" );
Shlomizr
Level I

Re: how to extract string between specific pattern using regexp?

Thanks, work perfect.

Re: how to extract string between specific pattern using regexp?

Assuming that you want fhe name associated with a specific ID in the parttern "id_name;", then this version will work, too.

 

Names Default to Here( 1 );

// test string
string = "123450_dan; 154396_kelli; 198756_dev;";

// target ID
id = "123450";

// assume specific ID
name = Regex( string, id || "_(\w+);", "\1" );
Show( name );
gzmorgan0
Super User (Alumni)

Re: how to extract string between specific pattern using regexp?

Just for fun, and because regular expressions can be tough to read by others (or two weeks later), Here is an alternative, using JMP functions.

 

Names Default to Here( 1 );

// test string
string = "123450_dan; 154396_kelli; 198756_dev;";

//create a list of id_name
id = Words(string, ";");

//create a list if ids, a list if names, an associative array(keyed list) 
idlist={};
namelist={};
lookup = [=>""];  //Associative Array that retuns an empty string, if number is not valid

for(i=1, i<=nitems(id), i++,
   InsertInto(idlist, num(word(1,Trim(id[i]),"_")) );
   InsertInto(namelist, Titlecase(word(2,Trim(id[i]),"_")) ); //don't need TitleCase
   lookup[idlist[i]]=namelist[i]  //ids are the keys and names are the values
);
show(id, idlist, namelist, lookup, lookup[198756], lookup[128954]); //last one is not valid

 

gzmorgan0
Super User (Alumni)

Re: how to extract string between specific pattern using regexp?

The script that I provided previously using JMP functions would not require any modifications.  The regular expression to capture one or more "names" after the underscore would be (\w+\s*)+ which means accumulate ( ) at  least one word \w+, zero or more spaces \s* and the trailing plus implies repeat the pattern one or more times.

 

Names Default to Here(1);
string="123450_dan fox; 154396_kelli ann marie; 198756_dev;";

name1 = Regex( string, "123450_((\w+\s*)+\b)", "\1" );
name2 = Regex( string, "154396_((\w+\s*)+\b)", "\1" );

Show( name1, name2 ); 
/*: name1 = "dan fox"; name2 = "kelli ann marie"; */
Shlomizr
Level I

Re: how to extract string between specific pattern using regexp?

Thanks, work perfect!!