john_madden
Community Trekker

Regex issue with escaped characters

I'm having a problem with regex and bracket characters.

This JSL snippet:

Show(Regex("abcdef[ghi]", "([^\[]*)\[([^\]]*)\]", "\1{\2}"))

doesn't work in JMP (14.1). It gives the following error:

Unexpected "([^]*)\[([^\]]*)\]", "\1{\2}"))". Perhaps there is a missing "," or ")".
Trying to parse arguments of function "Regex".

However, this is valid regex in my Perl regex tester. (I use Patterns on OS X.) In that environment, inputting:

abcdef[ghi]

to the Regular Expression

([^\[]*)\[([^\]]*)\]

using the Replacement value of

$1{$2}

yields the result

abcdef{ghi}

(The Perl expression is

$searchText =~ s/([^\[]*)\[([^\]]*)\]/$1{$2}/gism;

).

What is going on? Does the JMP regex engine need some other way of escaping bracket characters?

 

Thanks.

 

3 ACCEPTED SOLUTIONS

Accepted Solutions
pmroz
Super User

Re: Regex issue with escaped characters

This will do what you want.  Had to escape the two escape sequences!

Show(Regex("abcdef[ghi]", "([^\[\[]\]*)\[\[]\([^\]]*)\]", "\1{\2}"));
Regex("abcdef[ghi]", "([^\!\[]*)\!\[([^\]]*)\]", "\1{\2}") = "abcdef{ghi}";
john_madden
Community Trekker

Re: Regex issue with escaped characters

Aha! Thank you both pmroz and Mark. With some fiddling, I figured out how to do what I want.
1. For the benefit of the JSL interpreter, first escape the entire regex string using \[…]\.
2. Then, for the benefit of the Regex engine, escape the bracket characters as you would in Perl.
The result is:

Show(Regex("abcdef[ghi]", "\[([^\[]*)\[([^\]]*)\]]\", "\1{\2}"));

which yields the expected result of:

abcdef{ghi}

Breaking down the string we have:

1.      \[                                   for JSL
2.      ([^\[]*)\[([^\]]*)\]           for the Regex engine (just as in Perl)
3.      ]\                                   for JSL

Phew!!!

john_madden
Community Trekker

Re: Regex issue with escaped characters

As a follow-up, I learned something I didn't know about Perl regex: the engine is smart enough to know that "[^[]" means "everything but [". That is, it does not insist on a backslash before the second "[" in this particular context. It also knows how to interpret "[^abc[]" correctly.

It appears that the engine in JMP understands this as well.

So I was making more trouble for myself than I needed to by escaping the "[". 
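
A minimal sketch of the simplified pattern, assuming the same input string and the same \[ ... ]\ wrapper as in the earlier solution. The only change is the bare [ inside the negated class; the \[ that follows the first group is still needed by the regex engine, which is why the JSL wrapper stays.

```jsl
// Same pattern as before, but with an unescaped [ inside the negated class.
// The outer \[ ... ]\ keeps JSL from interpreting the remaining \[ itself.
Show( Regex( "abcdef[ghi]", "\[([^[]*)\[([^\]]*)\]]\", "\1{\2}" ) );
```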

9 REPLIES
john_madden
Community Trekker

Re: Regex issue with escaped characters

P.S. Weirdly, this works fine in JMP:

Show(Regex("abcdef(ghi)", "([^(]*)\(([^\)]*)\)", "\1[\2]"))

yielding the correct result:

Regex("abcdef(ghi)", "([^(]*)\(([^\)]*)\)", "\1[\2]") = "abcdef[ghi]";
pmroz
Super User

Re: Regex issue with escaped characters

I wonder if it's because you have the string escape character sequence "\[" in your string.  That sequence means treat everything afterwards as a string until you hit "]\".

"([^\[]*)\[([^\]]*)\]"

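A small sketch of that verbatim behavior on its own, outside of Regex. The variable name s is only for illustration.

```jsl
// Between \[ and ]\ JSL takes the text verbatim, so the inner
// double quotes do not terminate the string literal.
s = "\[a "quoted" word]\";
Show( Length( s ) );
```
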
 

john_madden
Community Trekker

Re: Regex issue with escaped characters

Interesting thought. In Perl \ escapes only the next character.

But if this is different in JMP, then this should work:

Show(Regex("abcdef[ghi]", "([^\[\]*)\[\([^\]\].*)\]\", "\1{\2}"))

Unfortunately, it doesn’t:

Regex: expecting ')' at position 21 in pattern '([^\]*)\[\([^\].*)\]\'. in access or evaluation of 'Regex' , Regex/*###*/("abcdef[ghi]", "([^\]*)\!\[\([^\].*)\]\", "\1{\2}")
Craige_Hales
Staff (Retired)

Re: Regex issue with escaped characters

@pmroz is correct. Long before regex support was added, JMP made an unfortunate choice: \[ ... ]\ anywhere in a string means ignore all special character meanings in between. Getting the literal text \[ requires JMP's other escape mechanism, \!X, with X replaced by \ , which represents a \ that doesn't combine with the [ when JMP is parsing the JSL. Later, when regex parses the pattern, the \ is followed by the [, which in turn tells regex to ignore the special meaning of [ and treat it as a simple character.
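
As a sketch of that approach, here is the original pattern with each backslash-before-[ written as \!\ so that JSL passes \[ through to the regex engine. This is the form JMP echoed in the log output quoted earlier in the thread, where it evaluates to "abcdef{ghi}".

```jsl
// \!\ is JSL's escape for a single backslash, so \!\[ reaches the regex
// engine as \[ instead of opening a JSL verbatim block.
Show( Regex( "abcdef[ghi]", "([^\!\[]*)\!\[([^\]]*)\]", "\1{\2}" ) );
```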

I'm thinking about how to make this easier, thanks for the report!

 

Craige

Re: Regex issue with escaped characters

The backslash character escapes the next character in general, just as in the Perl flavor, but it also denotes special character sets such as \s for space characters or \d for digits. You would not succeed if you tried to enter s by writing \s in the regex. The escape sequence \[ is special in JSL: it is paired with the closing ]\ sequence. That is, it is not the way to include the literal [ character. Instead, it includes the string verbatim up to the closing escape sequence.

 

See this example in exploded form (four capturing groups) that illustrates how they work:

 

Regex(
	"abcdef[ghi]",
	"([^\[\[]\]*)" || "(\[\[]\)" || "([^\[\[]\]*)" || "(\[\]]\)",
	"\1-\2-\3-\4"
);
Learn it once, use it forever!