How do I use Regex substitution?
I want to replace each "(" with a newline character and remove every character that is neither Chinese nor a digit. Example input:
d试1(-是23(a文字789(+内c容4$3
txt="d试1(-是23(a文字789(+内c容4$3";
na=Regex(txt, ?? );
Thanks!
Re: How do I use Regex substitution?
Use two statements. I added some extra problem characters to your example:
txt = "d试1(-是2^3([a]文/字789(+内\c容4$3";
step1 = Regex( txt, "\(", "\!n", globalreplace );
step2 = Regex( step1, "[-+/\\^$[\]a-zA-Z]", "", globalreplace );
The ( needs to be regex-escaped for step1. The newline in the replacement is JSL-escaped (\!n).
For step2, the syntax inside the [...] is complicated:
- Start with the - because, with no character before it, it can't form a range like a-z. (Anywhere else it needs a regex-escape, \- )
- The \ needs to be regex-escaped, \\
- The ] needs to be regex-escaped, \]
Be careful not to accidentally write \[ , which is a JSL escape that changes the way JSL interprets the string. This is easy to do when working with character sets in [...] if you don't know it will be a problem.
Notice the sequence [\] in the pattern. It is not a character set of its own. It is two members of the enclosing character set: a literal [ followed by the regex-escaped ].
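The character-set rules above can be checked on short ASCII strings. This is only a minimal sketch: the sample strings and variable names are made up for illustration, and the results in the comments follow from ordinary regex character-class rules.

```jsl
// A leading - is a literal hyphen; between two characters it forms a range.
literalHyphen = Regex( "a-b-c", "[-ac]", "", GLOBALREPLACE ); // removes a, c and the hyphens: "b"
rangeHyphen = Regex( "a-b-c", "[a-c]", "", GLOBALREPLACE );   // removes a, b and c: "--"

// Inside a set, an opening bracket is literal and the closing bracket is regex-escaped.
// The set here is written as bracket, bracket, backslash, bracket, bracket,
// so the backslash is never directly followed by an opening bracket.
noBrackets = Regex( "x[y]z", "[[\]]", "", GLOBALREPLACE );    // "xyz"

Show( literalHyphen, rangeHyphen, noBrackets );
```

Moving the - from the front of the set to the middle is the only difference between the first two patterns, which is why its position matters.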
Re: How do I use Regex substitution?
Thanks, Craige!
I'll just use EmEditor to do the regex replacement.
Re: How do I use Regex substitution?
Thanks, Craige!
I'd like to take this regex substitution one step further: insert a tab between the Chinese characters and the digits on each line (Chinese characters on the left, numbers on the right), so that the result of the substitution can be loaded directly into a JMP table.
How does the code need to change?
txt = "d试1(-是2^3([a]文/字789(+内\c容4$3";
step1 = Regex( txt, "\(", "\!n", globalreplace );//??
step2 = Regex( step1, "[-+/\\^$[\]a-zA-Z]", "", globalreplace );//??
s = N Items( step2 );
dt = New Table( "A", Add Rows( s ), New Column( "txt", Character, "Nominal" ), New Column( "num" ) );
dt[0, 0] = step2;
Re: How do I use Regex substitution?
This is an example of lookaround, using a negative lookbehind and a positive lookahead. Step 3 uses both.
(?<!\d)
is a negative lookbehind. It succeeds when the character just before the current zero-length position is not a digit (or when there is no character before it).
(?=\d)
is a positive lookahead. It succeeds when the character just after the current zero-length position is a digit.
Neither one consumes any characters. The tab character is inserted into the zero-length position to separate the non-digit from the digit.
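To see the two assertions in isolation, here is a minimal sketch on plain ASCII input. The strings "ab12cd3" and "7up" and the | marker are made up for illustration; the marker is used instead of a tab only so the insertion points are visible.

```jsl
// A marker goes into every position that has a non-digit (or nothing) before it
// and a digit after it. No characters are removed.
marked = Regex( "ab12cd3", "(?<!\d)(?=\d)", "|", GLOBALREPLACE ); // "ab|12cd|3"

// At the very start of the string there is no previous character,
// so the negative lookbehind also succeeds there.
atStart = Regex( "7up", "(?<!\d)(?=\d)", "|", GLOBALREPLACE );    // "|7up"

Show( marked, atStart );
```

Nothing is inserted between 1 and 2 because the character before that position is a digit.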
txt = "d试1(-是2^3([a]文/字789(+内\c容4$3";
// change paren to newline...
step1 = Regex( txt, "\(", "\!n", globalreplace );
// remove unwanted characters - + / \ ^ $ [ ] a..z A..Z
step2 = Regex( step1, "[-+/\\^$[\]a-zA-Z]", "", globalreplace );
// insert tab when character before is not a digit and character after is a digit
step3 = Regex( step2, "(?<!\d)(?=\d)", "\!t", globalreplace );
// add header
step4 = "aaa\!tbbb\!n" || step3;
// import the string without making a file on disk
Open( Char To Blob( step4 ), "text" );
In the imported table, the tabs separate the fields.
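If you would rather fill a table you create yourself, as the earlier attempt with New Table tried to do, one possible sketch is to split the step 3 string with Words(). This is not the method shown above, only an alternative. It assumes every line holds exactly one text field and one number field; the variable names rowText and fields are made up for illustration.

```jsl
txt = "d试1(-是2^3([a]文/字789(+内\c容4$3";
step1 = Regex( txt, "\(", "\!n", GLOBALREPLACE );
step2 = Regex( step1, "[-+/\\^$[\]a-zA-Z]", "", GLOBALREPLACE );
step3 = Regex( step2, "(?<!\d)(?=\d)", "\!t", GLOBALREPLACE );

// one list item per line of text
rowText = Words( step3, "\!n" );

dt = New Table( "A",
	Add Rows( N Items( rowText ) ),
	New Column( "txt", Character, "Nominal" ),
	New Column( "num", Numeric, "Continuous" )
);

For( i = 1, i <= N Items( rowText ), i++,
	// split each line at the tab: text on the left, digits on the right
	fields = Words( rowText[i], "\!t" );
	Column( dt, "txt" )[i] = fields[1];
	Column( dt, "num" )[i] = Num( fields[2] );
);
```

The Open( Char To Blob( ... ), "text" ) approach is shorter; the loop is mainly useful when you need control over column names and types.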
Re: How do I use Regex substitution?
Thanks, Craige!
Reading the code you write is a real pleasure!
Re: How do I use Regex substitution?
You can't learn these advanced techniques from a scripting guide.
Re: How do I use Regex substitution?
Thanks!
It is hard to make good examples for some features in regex. This is a good example for lookaround. I think I could do it without lookaround, but it would be more complicated. @shannon_conners for the doc team: using positive and negative, ahead and behind, together, is what makes this compact and expressive (with a comment). It is an unusual situation, not mainstream usage.
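For comparison, here is one possible version without lookaround. It is a sketch, not the author's code: it captures the non-digit and the digit in two groups and writes them back with a tab in between. It assumes that \1 and \2 in the replacement refer to the captured groups and that \D matches the Chinese characters. Unlike the lookaround version, it cannot insert a tab at the very start of the string, because \D needs a real character to match.

```jsl
// step2 result from the example above, written out directly
cleaned = "试1\!n是23\!n文字789\!n内容43";

// capture the non-digit and the digit, then put a tab between them
withGroups = Regex( cleaned, "(\D)(\d)", "\1\!t\2", GLOBALREPLACE );

// the lookaround version, for comparison
withLookaround = Regex( cleaned, "(?<!\d)(?=\d)", "\!t", GLOBALREPLACE );

Show( withGroups == withLookaround );
```

The lookaround pattern matches only a position, so nothing has to be captured and written back.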
Re: How do I use Regex substitution?
Thanks, Craige!
I'd like to ask you a question about a lower-level application.
I want to solve more problems with JSL alone; I cannot use Python or other tools.
Of course, I learned all of my JSL from the many experts in this community.
My specific request is:
How can I decode protobuf (Google Protocol Buffers) binary data into text in JMP using JSL?
Re: How do I use Regex substitution?
Interesting: https://developers.google.com/protocol-buffers/docs/overview
It looks like you have two choices: write your own (https://github.com/protocolbuffers/protobuf/blob/master/docs/third_party.md), which would then work for many protocols,
or, easier and harder,
pick an existing one and translate it. I have not looked at it, but I might try starting with a Java version and translating that to JSL. But then you get to repeat the translation for the next protocol.
It would be a pretty cool project to make a translator. I do not plan to do that.
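As a starting point for anyone who tries it, here is a minimal sketch of the first building block of the protobuf wire format, decoding a single varint. It is not a protobuf decoder. It assumes Blob To Matrix with "uint" and 1-byte elements returns one number per byte. The two bytes 96 01 are the example from the protobuf encoding documentation, which encodes 150.

```jsl
// the two bytes 0x96 0x01
bytes = Blob To Matrix( Hex To Blob( "9601" ), "uint", 1, "big" );
nBytes = N Rows( bytes ) * N Cols( bytes );

value = 0;
shift = 0;
For( i = 1, i <= nBytes, i++,
	b = bytes[i];
	// the low 7 bits are payload, least-significant group first
	value += Mod( b, 128 ) * 2 ^ shift;
	shift += 7;
	// a clear high bit marks the last byte of the varint
	If( b < 128, Break() );
);
Show( value ); // 150
```

A full decoder would also need the field tags, wire types, and the message schema, which is where the real work is.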