peder
Level III

pat match

Hi, I have a text string that I need to match, and I've been trying Pat Match() but haven't quite got the grasp of it. The text is a filename. I need to identify all files whose names follow a certain pattern:

 

"AB0007 25092019 0822.txt":

2 capital letters. "AB" actually always.

4 digits and a space

ddMMYYYY (or just 8 digits) and a space

4 digits

".txt"

 

I've been trying but failed with many versions, latest:

file="AB0007 25092019 0822.txt" // for testing; eventually it will be an item in a file list that I cycle through

pattern="AB"+"\d{4}"+"\s"+"\d{8}"+"\s"+"\d{4}";
pat match(file,pattern);

 

Any help would be appreciated!

Accepted Solutions
Craige_Hales
Staff (Retired)

Re: pat match

JSL has a Regex function that does traditional regex matching and a Pat Match function that can use snippets of regex if desired. The last example hints at why you might want to use Pat Match. Also notice the extra bit of either ^...$ or Pat Pos(0)...Pat R Pos(0) logic, which makes sure the entire filename was matched rather than just part of it.

 

file = "AB0007 25092019 0822.txt"; // for testing; eventually it will be an item in a file list that I cycle through

// decide if using (1) pure regex,
Show( Regex( file, "^AB\d{4}\s\d{8}\s\d{4}\.txt$" ) ); // "AB0007 25092019 0822.txt", or missing (.) if not matched

// or (2) pure pattern matching,
digit = Pat Any( "0123456789" );
pattern = Pat Pos( 0 ) + "AB" + Pat Repeat( digit, 4, 4 ) + " " + Pat Repeat( digit, 8, 8 ) + " " + Pat Repeat( digit, 4, 4 ) + ".txt" + Pat R Pos( 0 );
Show( Pat Match( file, pattern ) ); // rc is 1, it matches, 0 if fails to match

// or (3a) mix of both
pattern = Pat Pos( 0 ) + "AB" + Pat Regex( "\d{4}" ) + Pat Regex( "\s" ) + Pat Regex( "\d{8}" ) + Pat Regex( "\s" ) + Pat Regex( "\d{4}" ) + ".txt" + Pat R Pos( 0 );
Show( Pat Match( file, pattern ) ); // rc is 1, it matches, 0 if fails to match

// or collapse the mix (3b) a bit
pattern = Pat Pos( 0 ) + "AB" + Pat Regex( "\d{4}\s\d{8}\s\d{4}" ) + ".txt" + Pat R Pos( 0 );
Show( Pat Match( file, pattern ) ); // rc is 1, it matches, 0 if fails to match

// why you might choose (2) or (3a) -- you can capture substrings like the date part using the >> operator:
digit = Pat Any( "0123456789" );
pattern = Pat Pos( 0 ) + "AB" + Pat Repeat( digit, 4, 4 ) + " " + Pat Repeat( digit, 8, 8 ) >> datepart + " " + Pat Repeat( digit, 4, 4 ) + ".txt" + Pat R Pos( 0 );
Show( Pat Match( file, pattern ),informat(datepart,"ddmmyyyy") ); // rc is 1, Informat(datepart, "ddmmyyyy") = 25Sep2019;

Regex(file, "^AB\d{4}\s\d{8}\s\d{4}\.txt$") = "AB0007 25092019 0822.txt";
Pat Match(file, pattern) = 1;
Pat Match(file, pattern) = 1;
Pat Match(file, pattern) = 1;
Pat Match(file, pattern) = 1;
Informat(datepart, "ddmmyyyy") = 25Sep2019;
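
To see why the Pat Pos(0)...Pat R Pos(0) bookends matter, here is a small sketch that runs the same pattern with and without them. The name with extra characters on both ends is made up for illustration, and the variable names (name, core) are mine, not part of the original example.

```jsl
// Hypothetical filename with extra text before and after the part we care about
name = "xxAB0007 25092019 0822.txt.bak";

digit = Pat Any( "0123456789" );
// The core pattern, without any anchoring
core = "AB" + Pat Repeat( digit, 4, 4 ) + " " + Pat Repeat( digit, 8, 8 ) + " " + Pat Repeat( digit, 4, 4 ) + ".txt";

// Unanchored: Pat Match scans for the pattern anywhere in the string, so this should report 1
Show( Pat Match( name, core ) );

// Anchored at both ends: the whole string must fit the pattern, so this should report 0
Show( Pat Match( name, Pat Pos( 0 ) + core + Pat R Pos( 0 ) ) );
```

The same idea applies to Regex: without ^ and $ it is satisfied by a match somewhere inside the name.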

 

JMP's regex is built on top of the pattern matching functions; neither one will be faster than the other. The PatRegex function translates the regex into equivalent pattern matching functions so PatMatch will work.
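
Since the end goal was to cycle through a file list, here is a minimal sketch of that loop using option (1). The list is a hard-coded stand-in so the script runs anywhere; in practice it would presumably come from something like Files In Directory on your own folder. The names files, kept, and name are assumptions for illustration only.

```jsl
// Stand-in for a real directory listing; the second name has only 3 digits after "AB"
files = {"AB0007 25092019 0822.txt", "AB007 25092019 0822.txt", "notes.txt"};

kept = {};
For( i = 1, i <= N Items( files ), i++,
	name = files[i];
	// Regex returns the matched text, or missing if the name does not fit the pattern
	If( !Is Missing( Regex( name, "^AB\d{4}\s\d{8}\s\d{4}\.txt$" ) ),
		Insert Into( kept, name )
	);
);

// Only the first name should survive the filter
Show( kept );
```

If you also need the date from each name, swap the Regex test for the Pat Match pattern with >> datepart from the last example above.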

Craige


peder
Level III

Re: pat match

Not 1, not 2, but 5 solutions. Thank you so much!