lala
Level VII

Why does regex replacement not remove empty lines?

The goal is to download web content, keep only the visible text, and remove lines that contain nothing but spaces.
I use the following JSL, but it does not remove the blank lines. Why?

 

Thanks!

u = "https://www.jmp.com/support/help/zh-cn/17.2/jmp/jsl-terminology.shtml";
txt = Load Text File( u );
a = Length( txt );
t1 = Regex( txt, "<(.[^>]{0,})>", "", globalreplace );
t2 = Regex( t1, "  ", "", globalreplace );
t2 = Regex( t2, "^( {0,})\!n", "", globalreplace );
t2 = Regex( t2, "\!n\!n", "", globalreplace );
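
A minimal sketch of one way to drop whitespace-only lines without depending on how `^` and `\!n` behave inside the regex: split the text into lines, trim each one, and keep only the non-empty ones. The sample string and the names `lines`, `kept`, and `line` are made up for illustration. It assumes Trim() strips all of the whitespace that actually occurs in the page; HTML entities such as `&nbsp;` would remain.

```jsl
// Self-contained stand-in for the tag-stripped text (t1 in the script above).
t1 = "Title\!r\!n   \!r\!n\!r\!nFirst line\!n \!n\!nSecond line\!n";

// Words() treats every character of the second argument as a delimiter,
// and runs of delimiters do not produce empty items.
lines = Words( t1, "\!r\!n" );

kept = {};
For( i = 1, i <= N Items( lines ), i++,
	line = Trim( lines[i] ); // drop leading and trailing whitespace
	If( line != "", Insert Into( kept, line ) ); // skip whitespace-only lines
);

t2 = Concat Items( kept, "\!n" ); // rejoin the kept lines with single line feeds
Show( N Items( kept ), t2 );
```

Because the split is on both carriage returns and line feeds, the sketch does not care whether the server sends CR LF or LF line endings.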
14 REPLIES
lala
Level VII

Re: Why does regex replacement not remove empty lines?

The fundamentals of regular expressions are still hard for me to understand.

Thanks to the experts for the help.

Craige_Hales
Super User

Re: Why does regex replacement not remove empty lines?

@Jeff_Perkinson 

The site is sending you an "unsupported browser" page rather than the data you are expecting (your new link is slightly different).

Talk with the JMP sales team about your needs. I believe some documentation is already translated.

 

This works:

 

u = "https://www.jmp.com/support/help/zh-cn/17.2/jmp/jsl-terminology.shtml";
txt = Load Text File( u );
regexmatch(txt,"JSLString.+?</p>")

 

{
"JSLString\!">\!"Hello, World\!"</span>;</pre>
<pre id=\!"ww220404\!" class=\!"code\!"><span class=\!"JSLOperatorName\!">Show</span>( A );</pre>
<p id=\!"ww220406\!" class=\!"codeOutput\!">A = \!"Hello, World\!";</p>"
}
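
As a hedged aside, Regex Match() returns a list whose first item is the whole match and whose later items are the parenthesized groups, so a narrower pattern can pull out just the string literal. The sample text and the `[^<]*` pattern below are illustrative assumptions, not taken from the page itself; the pattern assumes the span contains no nested tags.

```jsl
// Small stand-in for the downloaded page, so the sketch runs without network access.
sample = "<pre class=\!"code\!">A = <span class=\!"JSLString\!">\!"Hello, World\!"</span>;</pre>";

// [^<]* cannot cross a tag boundary, so the match stops at the closing </span>.
m = Regex Match( sample, "JSLString\!">([^<]*)<" );

// m[1] is the whole match; m[2] is the text captured by the parentheses.
Show( m );
```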

 

Craige
lala
Level VII

Re: Why does regex replacement not remove empty lines?

Thanks, Craige!

Yes, most of the JMP help documentation has been translated. However, the scripting and JSL parts have not.

If I use Google Translate, the JSL code inside the pages gets translated as well, which makes it unusable for reading.

Thanks again to the experts! Thanks to the JMP team!

lala
Level VII

Re: Why does regex replacement not remove empty lines?

A follow-up question for the experts: can Regex Match() take a global option like GLOBALREPLACE, so that multiple segments of content at different locations can be extracted at once?

Thanks, Craige!

Craige_Hales
Super User

Re: Why does regex replacement not remove empty lines?

No, and maybe yes.

Add flag to Regex Match() to find all non-overlapping occurances of pattern 

Regex: add options for all flags 

You can use regex inside a pattern match to do this:

u = "https://www.jmp.com/support/help/zh-cn/17.2/jmp/jsl-terminology.shtml";
txt = Load Text File( u );
matches = {}; // record matches here
rc = Pat Match(
	txt,
	Pat Repeat(
		Pat Pos() >> pos // remember the position just before the JSLString text
		+ Pat Regex( "JSLString.+?</p>" ) >> str // keep the matched text in str
		+ Pat Test( // use the test to inject some JSL into the matcher
			matches[nitems(matches)+1] = Eval List( {pos, str} ); // record the match
			1; // the "test" succeeds
		) + Pat Arb() // skip over more and more arbitrary text
	) + Pat R Pos( 0 ) // make sure the match reaches the end of the text
);
Show( rc, nitems(matches) );
For( i = 1, i <= nitems(matches), i += 1,
	Show( i, matches[i] )
);

Also, your regex does not really match the structure of the HTML: it will miss some occurrences and merge others. HTML is not a good format for reworking text this way. The script above finds 2 of the 3 matches on the page (merging the 2nd and 3rd into one) because no </p> occurs between them.

rc = 1;
N Items(matches) = 2;
...first match...
i = 1;
matches[i] = {8591, "JSLString\!">\!"Hello, World\!"</span>;</pre>
          <pre id=\!"ww220404\!" class=\!"code\!"><span class=\!"JSLOperatorName\!">Show</span>( A );</pre>
          <p id=\!"ww220406\!" class=\!"codeOutput\!">A = \!"Hello, World\!";</p>"};
... second match...
i = 2;
matches[i] = {10559, "JSLString\!">\!"My Line Graph\!"</span> ),</pre>
          <pre id=\!"ww229621\!" class=\!"code\!">		Frame Size( <span class=\!"JSLNumber\!">300</span>, <span class=\!"JSLNumber\!">500</span> ),</pre>
          <pre id=\!"ww229622\!" class=\!"code\!">		<span class=\!"JSLOperatorName\!">Marker</span>( <span class=\!"JSLOperatorName\!">Marker State</span>( <span class=\!"JSLNumber\!">3</span> ), [<span class=\!"JSLNumber\!">11</span> <span class=\!"JSLNumber\!">44</span> <span class=\!"JSLNumber\!">77</span>], [<span class=\!"JSLNumber\!">75</span> <span class=\!"JSLNumber\!">25</span> <span class=\!"JSLNumber\!">50</span>] );</pre>
          <pre id=\!"ww229623\!" class=\!"code\!">		<span class=\!"JSLOperatorName\!">Pen Color</span>( <span 
...there was no </p>, so the second match continues...
class=\!"JSLString\!">\!"Blue\!" </span>);</pre>
          <pre id=\!"ww229612\!" class=\!"code\!">		<span class=\!"JSLOperatorName\!">Line</span>( [<span class=\!"JSLNumber\!">10 30 70</span>], [<span class=\!"JSLNumber\!">88 22 44</span>] ));</pre>
          <p id=\!"ww236173\!" class=\!"body\!">Note that the <span class=\!"code\!">Frame Size()</span> arguments <span class=\!"code\!">300</span> and <span class=\!"code\!">500</span> are not named. The position of these arguments implies meaning; the first argument is always the width, the second argument is always the height.</p>"};
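
For comparison, a rough sketch of another workaround (not the pattern-match approach above): call Regex Match() in a loop and cut the consumed text off the front after each hit. The sample text, the `[^<]*` pattern, and the names `rest` and `m` are assumptions for illustration. The pattern stops at the first tag after each string literal, so separate literals are not merged, but it also returns only the literal, not the surrounding code.

```jsl
// Stand-in for the page text; a real script would use txt from Load Text File().
txt = "<span class=\!"JSLString\!">\!"Hello, World\!"</span> and "
	|| "<span class=\!"JSLString\!">\!"My Line Graph\!"</span> and "
	|| "<span class=\!"JSLString\!">\!"Blue\!" </span>";

matches = {}; // captured string literals, in page order
rest = txt; // text that has not been searched yet
While( 1,
	m = Regex Match( rest, "JSLString\!">([^<]*)<" );
	If( N Items( m ) == 0, Break() ); // empty list: no further match
	Insert Into( matches, m[2] ); // keep the captured group
	// m[1] is the leftmost match, so its first occurrence marks where to resume.
	rest = Substr( rest, Contains( rest, m[1] ) + Length( m[1] ) );
);
Show( N Items( matches ), matches );
```

With the three-span sample above, the loop should collect three items, one per string literal.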

 

 

 

Craige