lala
Level VII

How do I use Regex substitution?

I want to replace each "(" with a newline character and remove all characters that are neither Chinese nor numeric.

 

d试1(-是23(a文字789(+内c容4$3

2023-01-31_20-13-35.png

txt="d试1(-是23(a文字789(+内c容4$3";

na=Regex(txt,  ?? );

 Thanks!

12 REPLIES
Craige_Hales
Super User

Re: How do I use Regex substitution?

Use two statements. I added some extra problem characters to your example...

txt = "d试1(-是2^3([a]文/字789(+内\c容4$3";
step1 = Regex( txt, "\(", "\!n", globalreplace );
step2 = Regex( step1, "[-+/\\^$[\]a-zA-Z]", "", globalreplace );

The ( needs to be regex-escaped for step1. The newline is JSL-escaped.

For step2, the syntax of the [...] is complicated:

  • start with the - because without a leading character it can't be a range like a-z. (Otherwise it needs a regex-escape, \- )
  • the \ needs to be regex-escaped
  • the ] needs to be regex-escaped

Be careful not to accidentally write the JSL-escape \[ which changes the way JSL interprets strings. This is likely to happen when working with character sets in [...] if you don't know it will be a problem.

Notice the sequence [\]. It is not a character set.  It is 2 characters that are in a character set.
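
As a small illustration of the rules above, here is a sketch using short test strings that are not from the original example (the names s, r1 to r4 are only placeholders). It shows the hyphen as a literal when it comes first, the regex-escaped hyphen, what happens when an unescaped hyphen sits between two characters, and the two brackets inside a character set.

```jsl
// Hypothetical test string, not from the thread
s = "a-b+c/d";
// Hyphen first: it is a literal, so the set is - + /
r1 = Regex( s, "[-+/]", "", globalreplace );
// Hyphen in the middle: regex-escape it to keep it literal
r2 = Regex( s, "[+\-/]", "", globalreplace );
// Unescaped hyphen in the middle: +-/ is a range covering + , - . /
r3 = Regex( "1,2.3", "[+-/]", "", globalreplace );
// [ needs no escape inside the set, ] does; note there is no \[ anywhere
r4 = Regex( "x[y]z", "[[\]]", "", globalreplace );
Show( r1, r2, r3, r4 );
// expected: r1 = "abcd"; r2 = "abcd"; r3 = "123"; r4 = "xyz";
```

The r3 case is the one to watch for: the range silently removes the comma and the period as well, which is easy to miss in real data.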

Craige
lala
Level VII

Re: How do I use Regex substitution?

Thanks, Craige!

I'll just use EmEditor to do the regex replacement.

lala
Level VII

Re: How do I use Regex substitution?


Thanks, Craige!

 

I'd like to take this regex substitution one step further: insert a tab between the Chinese characters and the numbers (Chinese characters on the left, numbers on the right), so that the result of the substitution can be loaded directly into a JMP table.
How should the code be changed?

2023-02-01_17-58-10.png

 

txt = "d试1(-是2^3([a]文/字789(+内\c容4$3";
step1 = Regex( txt, "\(", "\!n", globalreplace );//??
step2 = Regex( step1, "[-+/\\^$[\]a-zA-Z]", "", globalreplace );//??

s = N Items( step2 );
dt = New Table( "A", Add Rows( s ), New Column( "txt", Character, "Nominal" ), New Column( "num" ) );
dt[0, 0] = step2;

 

Craige_Hales
Super User

Re: How do I use Regex substitution?

This is a good case for lookaround, specifically negative lookbehind and positive lookahead. Step 3 uses both.

(?<!\d)

asserts that the character just before the current zero-length position is not a digit. It consumes nothing, and it also succeeds at the very start of the string, where there is no preceding character.

(?=\d)

asserts that the character just after the current zero-length position is a digit, again without consuming it.

The tab character is inserted into the zero-length position to separate the non-digit from the digit.
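
To make the zero-length position visible, a minimal sketch with a made-up input can insert a visible marker instead of a tab. The input string and the name demo are assumptions for illustration only.

```jsl
// Hypothetical input; "|" marks each zero-length match so it can be seen
demo = Regex( "ab12cd345", "(?<!\d)(?=\d)", "|", globalreplace );
Show( demo );
// expected: demo = "ab|12cd|345";
```

Only the boundary from a non-digit to a digit matches. The position between 1 and 2 does not, because the character before it is a digit.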

 

txt = "d试1(-是2^3([a]文/字789(+内\c容4$3";
// change paren to newline...
step1 = Regex( txt, "\(", "\!n", globalreplace );
// remove unwanted characters - + / \ ^ $ [ ] a..z A..Z
step2 = Regex( step1, "[-+/\\^$[\]a-zA-Z]", "", globalreplace );
// insert tab when character before is not a digit and character after is a digit
step3 = Regex( step2, "(?<!\d)(?=\d)", "\!t", globalreplace );
// add header
step4 = "aaa\!tbbb\!n" || step3;
// import the string without making a file on disk
Open( Char To Blob( step4 ), "text" );

tabs separate fields
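
If you would rather fill a table yourself, along the lines of the earlier New Table attempt, one possible sketch is to split the string into lines and then split each line at the tab. The step3 value is written out literally so the block stands alone. It assumes every line has both a text part and a number part, and the table and column names are just the ones from the question.

```jsl
// Value that steps 1 to 3 should produce for the thread's txt
step3 = "试\!t1\!n是\!t23\!n文字\!t789\!n内容\!t43";
// One list item per line
lines = Words( step3, "\!n" );
dt = New Table( "A",
	Add Rows( N Items( lines ) ),
	New Column( "txt", Character, "Nominal" ),
	New Column( "num", Numeric, "Continuous" )
);
For( i = 1, i <= N Items( lines ), i++,
	// Split each line at the tab: text on the left, digits on the right
	parts = Words( lines[i], "\!t" );
	Column( dt, "txt" )[i] = parts[1];
	Column( dt, "num" )[i] = Num( parts[2] );
);
```

N Items works on the list returned by Words, not on the string itself, which is why the string has to be split first. Opening the blob as text is shorter; the loop is only useful when you need control over each value as it goes in.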

Craige
lala
Level VII

Re: How do I use Regex substitution?

Thanks, Craige!

Your code is a real pleasure to read!

lala
Level VII

Re: How do I use Regex substitution?

You can't learn these advanced techniques from a scripting guide.

Craige_Hales
Super User

Re: How do I use Regex substitution?

Thanks!

It is hard to make good examples for some features in regex. This is a good example for lookaround. I think I could do it without lookaround, but it would be more complicated. @shannon_conners for the doc team: using positive and negative, ahead and behind, together, is what makes this compact and expressive (with a comment). It is an unusual situation, not mainstream usage.
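
For comparison, one way it might look without lookaround is to capture the non-digit and the digit and write them back with a tab in between. This is only a sketch, not necessarily the alternative meant above; step2 is written out literally and the names withLook and withGroups are placeholders.

```jsl
// Value that steps 1 and 2 should produce for the thread's txt
step2 = "试1\!n是23\!n文字789\!n内容43";
// Lookaround version from the thread: nothing is consumed
withLook = Regex( step2, "(?<!\d)(?=\d)", "\!t", globalreplace );
// Capture-group version: consume both neighbors, then put them back
withGroups = Regex( step2, "(\D)(\d)", "\1\!t\2", globalreplace );
Show( withLook == withGroups );
```

For this input the two results should compare equal. They differ when the string starts with a digit: the lookbehind succeeds at the start of the string, while (\D) needs an actual character to capture.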

 

Craige
lala
Level VII

Re: How do I use Regex substitution?

Thanks, Craige!

 

I'd like to ask you about a lower-level use case:
I want to handle more tasks with JSL alone, because I cannot use Python or other tools.
Of course, I learned all of my JSL from the many experts in this community.

My specific request is:

How can I convert protobuf (Google Protocol Buffers) binary data back into text in JMP using JSL?

Craige_Hales
Super User

Re: How do I use Regex substitution?

Interesting: https://developers.google.com/protocol-buffers/docs/overview

It looks like you have two choices: write your own https://github.com/protocolbuffers/protobuf/blob/master/docs/third_party.md which would then work for many protocols,

 

or, easier and harder,

 

pick an existing one and translate it. I have not looked at it, but I might try starting with a Java version and translating that to JSL. But then you get to repeat the translation for the next protocol.

 

It would be a pretty cool project to make a translator. I do not plan to do that.
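
To give a feel for what such a translation involves, here is a minimal sketch of one building block only: decoding a single protobuf varint from a blob in JSL. It is not a parser. The two bytes 96 01 are the well-known example that encodes 150, and all variable names are hypothetical.

```jsl
// Two bytes of a varint, given as hex (encodes 150)
bytes = Blob To Matrix( Hex To Blob( "9601" ), "uint", 1, "big" );
value = 0;
shift = 0;
For( i = 1, i <= N Rows( bytes ), i++,
	b = bytes[i];
	// The low 7 bits carry data, least significant group first
	value += Mod( b, 128 ) * 2 ^ shift;
	shift += 7;
	// A clear high bit marks the last byte of the varint
	If( b < 128, Break() );
);
Show( value );
// expected: value = 150;
```

A real decoder would also need field tags, wire types, length-delimited fields, and the message definitions, which is where most of the work would be.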

Craige