peder
Level III

pat match

Hi, I have a text string that I need to match. I've been trying Pat Match() but haven't got the hang of it. The text is a filename, and I need to identify all files whose names follow a certain pattern:

 

"AB0007 25092019 0822.txt":

2 capital letters. "AB" actually always.

4 digits and a space

ddMMYYYY (or just 8 digits) and a space

4 digits

".txt"

 

I've tried many versions without success. The latest:

file="AB0007 25092019 0822.txt" // for testing; eventually it will be an item in a file list that I cycle through

pattern="AB"+"\d{4}"+"\s"+"\d{8}"+"\s"+"\d{4}";
pat match(file,pattern);

 

Any help would be appreciated!

1 ACCEPTED SOLUTION

Craige_Hales
Super User

Re: pat match

JSL has a Regex function that does traditional regex matches and a Pat Match function that can use snippets of regex if desired. The last example below hints at why you might want to use Pat Match. Also notice the extra anchoring logic, either ^...$ or Pat Pos(0)...Pat R Pos(0), which makes sure the entire filename was matched.
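
To see what the anchors guard against, here is a minimal sketch. The file name in `bad` and the variable `core` are made up for illustration; the regex itself is the one used in this thread. Without anchors, both functions should find the pattern somewhere inside the longer name; with anchors, the stray leading "x" and trailing ".bak" should make the match fail.

```jsl
// Hypothetical file name with extra characters before and after the expected form
bad = "xAB0007 25092019 0822.txt.bak";

// Unanchored regex: looks for the pattern anywhere inside the string
Show( Regex( bad, "AB\d{4}\s\d{8}\s\d{4}\.txt" ) );

// Anchored with ^...$: the whole string has to have the expected form
Show( Is Missing( Regex( bad, "^AB\d{4}\s\d{8}\s\d{4}\.txt$" ) ) );

// Same comparison with Pat Match, without and with Pat Pos(0)...Pat R Pos(0)
core = "AB" + Pat Regex( "\d{4}\s\d{8}\s\d{4}" ) + ".txt";
Show( Pat Match( bad, core ) );
Show( Pat Match( bad, Pat Pos( 0 ) + core + Pat R Pos( 0 ) ) );
```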

 

file = "AB0007 25092019 0822.txt"; // for testing; eventually it will be an item in a file list that I cycle through

// decide if using (1) pure regex,
Show( Regex( file, "^AB\d{4}\s\d{8}\s\d{4}\.txt$" ) ); // "AB0007 25092019 0822.txt", or missing (.) if not matched

// or (2) pure pattern matching,
digit = Pat Any( "0123456789" );
pattern = Pat Pos( 0 ) + "AB" + Pat Repeat( digit, 4, 4 ) + " " + Pat Repeat( digit, 8, 8 ) + " " + Pat Repeat( digit, 4, 4 ) + ".txt" + Pat R Pos( 0 );
Show( Pat Match( file, pattern ) ); // rc is 1, it matches, 0 if fails to match

// or (3a) mix of both
pattern = Pat Pos( 0 ) + "AB" + Pat Regex( "\d{4}" ) + Pat Regex( "\s" ) + Pat Regex( "\d{8}" ) + Pat Regex( "\s" ) + Pat Regex( "\d{4}" ) + ".txt" + Pat R Pos( 0 );
Show( Pat Match( file, pattern ) ); // rc is 1, it matches, 0 if fails to match

// or collapse the mix (3b) a bit
pattern = Pat Pos( 0 ) + "AB" + Pat Regex( "\d{4}\s\d{8}\s\d{4}" ) + ".txt" + Pat R Pos( 0 );
Show( Pat Match( file, pattern ) ); // rc is 1, it matches, 0 if fails to match

// why you might choose (2) or (3a) -- you can capture substrings like the date part using the >> operator:
digit = Pat Any( "0123456789" );
pattern = Pat Pos( 0 ) + "AB" + Pat Repeat( digit, 4, 4 ) + " " + Pat Repeat( digit, 8, 8 ) >> datepart + " " + Pat Repeat( digit, 4, 4 ) + ".txt" + Pat R Pos( 0 );
Show( Pat Match( file, pattern ),informat(datepart,"ddmmyyyy") ); // rc is 1, Informat(datepart, "ddmmyyyy") = 25Sep2019;

Regex(file, "^AB\d{4}\s\d{8}\s\d{4}\.txt$") = "AB0007 25092019 0822.txt";
Pat Match(file, pattern) = 1;
Pat Match(file, pattern) = 1;
Pat Match(file, pattern) = 1;
Pat Match(file, pattern) = 1;
Informat(datepart, "ddmmyyyy") = 25Sep2019;

 

JMP's Regex is built on top of the pattern matching functions, so neither one will be faster than the other. The Pat Regex function translates the regex into equivalent pattern matching functions so that Pat Match can work with it.
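
Since the question mentions cycling through a file list, here is a minimal sketch of applying the anchored pattern to each item. The list `files` is made-up test data standing in for a real directory listing (for example from Files In Directory()), and `matched` and `dates` are hypothetical names. The parentheses around the capture are added only to make the grouping explicit.

```jsl
// Hypothetical test data; in practice this would come from a directory listing
files = {"AB0007 25092019 0822.txt", "AB0008 26092019 0915.txt", "notes.txt",
	"AB0009 27092019 1000.txt.bak"};

// Anchored pattern from the answer above, capturing the 8-digit date part
digit = Pat Any( "0123456789" );
pattern = Pat Pos( 0 ) + "AB" + Pat Repeat( digit, 4, 4 ) + " "
	+ (Pat Repeat( digit, 8, 8 ) >> datepart) + " "
	+ Pat Repeat( digit, 4, 4 ) + ".txt" + Pat R Pos( 0 );

matched = {}; // file names that fit the pattern
dates = {};   // the corresponding dates, parsed from the captured text

For( i = 1, i <= N Items( files ), i++,
	If( Pat Match( files[i], pattern ),
		Insert Into( matched, files[i] );
		Insert Into( dates, Informat( datepart, "ddmmyyyy" ) )
	)
);

Show( matched, dates );
```

Only names that fit the full form should end up in `matched`; "notes.txt" and the ".bak" name are there to exercise the anchors.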

Craige

2 REPLIES 2
peder
Level III

Re: pat match

Not 1, not 2, but 5 solutions. Thank you so much!