Regex issue with escaped characters
I'm having a problem with regex and bracket characters.
This JSL snippet:
Show(Regex("abcdef[ghi]", "([^\[]*)\[([^\]]*)\]", "\1{\2}"))
doesn't work in JMP (14.1). It gives the following error:
Unexpected "([^]*)\[([^\]]*)\]", "\1{\2}"))". Perhaps there is a missing "," or ")".
Trying to parse arguments of function "Regex".
However, this is valid regex in my Perl regex tester. (I use Patterns on OS X.) In that environment, inputting:
abcdef[ghi]
to the Regular Expression
([^\[]*)\[([^\]]*)\]
using the Replacement value of
$1{$2}
yields the result
abcdef{ghi}
(The Perl expression is
$searchText =~ s/([^\[]*)\[([^\]]*)\]/$1{$2}/gism;
).
What is going on? Does JMP's regex engine need some other way of escaping bracket characters?
Thanks.
Accepted Solutions
Re: Regex issue with escaped characters
This will do what you want. Had to escape the two escape sequences!
Show(Regex("abcdef[ghi]", "([^\[\[]\]*)\[\[]\([^\]]*)\]", "\1{\2}"));
Regex("abcdef[ghi]", "([^\!\[]*)\!\[([^\]]*)\]", "\1{\2}") = "abcdef{ghi}";
Re: Regex issue with escaped characters
Aha! Thank you both pmroz and Mark. With some fiddling, I figured out how to do what I want.
1. For the benefit of the JSL interpreter, first escape the entire regex string using \[…]\.
2. Then, for the benefit of the Regex engine, escape the bracket characters as you would in Perl.
The result is:
Show(Regex("abcdef[ghi]", "\[([^\[]*)\[([^\]]*)\]]\", "\1{\2}"));
which yields the expected result:
abcdef{ghi}
Breaking down the string we have:
1. \[ for JSL
2. ([^\[]*)\[([^\]]*)\] for the Regex engine (just as in Perl)
3. ]\ for JSL
Phew!!!
Re: Regex issue with escaped characters
As a follow-up, I learned something I didn't know about Perl regex: the engine is smart enough to know that "[^[]" means "everything but [". That is, it does not insist on a backslash before the second "[" in this particular context. It also knows how to interpret "[^abc[]" correctly.
It appears that the engine in JMP understands this as well.
So I was making more trouble for myself than I needed to by escaping the "[".
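A minimal JSL sketch of that simplification, assuming the engine behaves as described above in your JMP version. Inside the character class the [ is left bare, so the only backslash that still has to be protected from the JSL parser is the one before the literal [ outside the class. The variable name rx is just for illustration.

```jsl
// Inside the class, [ needs no backslash, so JSL never sees a \[ there.
// Outside the class, \!\ gives one backslash, which escapes [ for the regex engine.
rx = "([^[]*)\!\[([^\]]*)\]";
Show( Regex( "abcdef[ghi]", rx, "\1{\2}" ) );
```

If the bare [ is accepted inside the class, this should give the same result as the longer forms earlier in the thread.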
Re: Regex issue with escaped characters
P.S. Weirdly, this works fine in JMP:
Show(Regex("abcdef(ghi)", "([^(]*)\(([^\)]*)\)", "\1[\2]"))
yielding the correct result:
Regex("abcdef(ghi)", "([^(]*)\(([^\)]*)\)", "\1[\2]") = "abcdef[ghi]";
Re: Regex issue with escaped characters
I wonder if it's because you have the string escape character sequence "\[" in your string. That sequence means treat everything afterwards as a string until you hit "]\".
"([^\[]*)\[([^\]]*)\]"
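To see the \[ ... ]\ mechanism on its own, here is a small sketch. The string contents and the variable name s are made up for illustration, and it assumes the raw-string behavior described in this post.

```jsl
// Between \[ and ]\ nothing is interpreted, so quotes and backslashes pass through as typed.
s = "\[He said "hi" and typed a \ here]\";
Show( s );
// The same mechanism means "\[\[]\" holds just the two characters \ and [.
Show( Length( "\[\[]\" ) );
```

That second string is the building block used elsewhere in this thread to smuggle a literal \[ into a regex pattern.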
Re: Regex issue with escaped characters
But if this is different in JMP, then this should work:
Show(Regex("abcdef[ghi]", "([^*)\[\([^\]\].*)\]\", "\1{\2}"))
Unfortunately, it doesn’t:
Regex: expecting ')' at position 21 in pattern '([^\]*)\[\([^\].*)\]\'. in access or evaluation of 'Regex' , Regex/*###*/("abcdef[ghi]", "([^\]*)\!\[\([^\].*)\]\", "\1{\2}")
Re: Regex issue with escaped characters
@pmroz is correct. JMP made an unfortunate choice, long before the regex support was added, to allow the \[ ... ]\ anywhere in a string to mean ignore all special character meanings in between. To get the literal text \[ requires using JMP's other escape mechanism \!X, replacing X with \ , to represent a \ that doesn't combine with the [ when JMP is parsing the JSL. Later, when regex parses the pattern, the \ is followed by the [ which in turn tells regex to ignore the special meaning of [ and treat it as a simple character.
I'm thinking about how to make this easier, thanks for the report!
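As a rough side-by-side sketch of the two workarounds that come up in this thread (the variable names rx1 and rx2 are only for illustration), both strings should reach the regex engine as the same pattern:

```jsl
// Option 1: \!\ is JSL's escape for one backslash, so \!\[ reaches the regex engine as \[.
rx1 = "([^\!\[]*)\!\[([^\]]*)\]";
// Option 2: wrap the whole pattern in \[ ... ]\ so JSL passes it through untouched.
rx2 = "\[([^\[]*)\[([^\]]*)\]]\";
Show( Regex( "abcdef[ghi]", rx1, "\1{\2}" ) );
Show( Regex( "abcdef[ghi]", rx2, "\1{\2}" ) );
```

Option 1 keeps the string an ordinary JSL literal but is harder to read. Option 2 lets you paste a Perl-style pattern unchanged, as long as the pattern itself does not contain the closing ]\ sequence.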
Re: Regex issue with escaped characters
The backslash escapes the next character in general, just as in the Perl flavor, but it also denotes special character classes such as \s for whitespace or \d for digits, so you cannot enter a literal s by writing \s in the regex. In a JSL string, the escape sequence \[ is special and is paired with the closing ]\ sequence. That is, it is not the way to include a literal [ character. Instead, JSL takes everything up to the closing ]\ sequence as literal text.
See this example in exploded form (four capturing groups) that illustrates how they work:
Regex(
"abcdef[ghi]",
"([^\[\[]\]*)" || "(\[\[]\)" || "([^\[\[]\]*)" || "(\[\]]\)",
"\1-\2-\3-\4"
);