Regex help
All,
I'm trying to get a better handle on regular expressions. I can achieve what I want with a simple Contains(), but I'd like to see what the equivalent would be using a regular expression.
The regex I have doesn't work the way I expect it to. Can somebody point me to the equivalent regex and explain what I have gotten wrong?
Clear Log(); Clear Globals();
dt = Open( "$SAMPLE_DATA/Cities.jmp" );
dt:city[2] = "aLBUQUERQUE";
CityList = dt:City << Get Values;
Des = list();
for(i = 1 , i <= N Items(CityList),i++,
	If(!IsMissing(Regex(Char(CityList[i]),"^a|A[LB]*")),
		Insert Into(Des,CityList[i]);
	);
);
/*for(i = 1 , i <= N Items(CityList) , i++,
	If(Contains(CityList[i],"aLB")|Contains(CityList[i],"ALB"),
		Insert Into(Des,CityList[i]);
	);
);*/
Show(Des);
//Close All(Data Tables,"No Save");
Uday
Accepted Solutions
Re: Regex help
Right now you are matching either an 'a' at the beginning of the string ('^a') or an 'A' followed by zero or more letters 'L' or 'B' ('A[LB]*'). Some notes:
- There are no parentheses around the alternation, so the '|' splits the whole pattern: the match is either everything to the left of the '|' or everything to the right of it. The '^' anchor therefore applies only to the left branch. If the string starts with 'a', the left branch matches and the match ends there.
- The square brackets mean match any one of the characters inside them.
- The asterisk means match zero or more of the preceding character or group; in this case that means the character class inside the square brackets.
- Together, the right side of the alternation will match any of these, anywhere inside your string (not just at the beginning):
  - A
  - AL
  - AB
  - ALB
  - ABL
  - ALLLLLLLLLLBLLBBBBLLBLLB
- The string aA would actually contain two separate matches of the pattern: 'a' from the left branch and 'A' from the right branch.
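To make the notes above concrete, here is a minimal sketch that runs the original pattern against a few strings. The test strings are hypothetical and chosen for illustration (they are not read from Cities.jmp); the only calls used are Regex(), Is Missing(), and Show().

```jsl
// Original pattern from the question: '^a' OR 'A[LB]*' (the '^' covers only the left branch).
pattern = "^a|A[LB]*";

// Hypothetical test strings, not taken from the sample data table.
tests = {"aLBUQUERQUE", "ALBANY", "DALLAS", "apple", "BOSTON"};

For( i = 1, i <= N Items( tests ), i++,
	// Regex() returns the first matched text, or missing when nothing matches.
	m = Regex( tests[i], pattern );
	If( Is Missing( m ),
		Show( tests[i], "no match" ),
		Show( tests[i], m )
	);
);
```

By the rules described above, "apple" should match through the left branch (a leading 'a'), and "DALLAS" should match through the unanchored right branch (the "ALL" in the middle), even though neither contains "aLB" or "ALB". Only "BOSTON", which has no 'A' at all, should come back missing.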
You probably want this: "^(a|A)LB.*$", or this: "^[aA]LB.*$"
- Starting at the beginning of the string (^), match an a or A, followed by
- The characters LB, in order, followed by
- Zero or more of any character (.*, where the period matches any single character), followed by
- The end of the string ($). The trailing .*$ is probably not necessary if you only test whether a match exists.
Des = {"ALBANY", "aLBUQUERQUE"};
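As a quick check of the corrected pattern, the following sketch filters a small list using the shortened form "^[aA]LB" (without the optional trailing .*$). The list contents and the variable names tests and matches are assumptions made for illustration, not part of the original script.

```jsl
// Corrected pattern: 'a' or 'A' at the start of the string, then the literal 'LB'.
pattern = "^[aA]LB";

// Hypothetical test strings, not taken from the sample data table.
tests = {"aLBUQUERQUE", "ALBANY", "DALLAS", "apple", "BOSTON"};

matches = {};
For( i = 1, i <= N Items( tests ), i++,
	// Keep the string only when Regex() finds a match (result is not missing).
	If( !Is Missing( Regex( tests[i], pattern ) ),
		Insert Into( matches, tests[i] )
	);
);
Show( matches );
```

Only "aLBUQUERQUE" and "ALBANY" should end up in matches. Note that the '^' anchor makes this stricter than the Contains() version in the question, which finds "aLB" or "ALB" anywhere in the string; dropping the '^' (pattern "[aA]LB") would mirror Contains() more closely.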
Check out regexr.com; it not only helps check your pattern but also has good 'reference' and 'cheatsheet' sections on the left. I almost always turn multiline on (flags in the upper right).
Edited to clarify that match can be anywhere inside the string.