<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Regex: can't get all parts of the string that match in Discussions</title>
    <link>https://community.jmp.com/t5/Discussions/Regex-can-t-get-all-parts-of-the-string-that-match/m-p/732298#M91422</link>
    <description>&lt;P&gt;Better for a future maintainer of the JSL, or better for the computer, or some other measure? Here are a couple of ideas; I lean towards the first one, partly because I imagine you started with a JSL expression, not a text string.&lt;/P&gt;
&lt;P&gt;A note about comments and seeing trees vs seeing forests: my comments are tree-level. They don't describe your goal (the forest); they describe the trees. Anyone maintaining your JSL in the future will want both kinds. A forest-level comment should explain why you are thinking about these values. A tree-level comment explains how a non-obvious bit of code works.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;ex = "Select Where( :\!"BWA\!"n == \!"BWA005\!" | :\!"BWA\!"n == \!"BWAZ006\!" )";
// write(ex) // Select Where( :"BWA"n == "BWA005" | :"BWA"n == "BWAZ006" )
// Regex match( ex, "[\!"](BWA|BWAZ|BWAZ_)([0-9]{1,})(.*?)[\!"]" )

// desired result: "BWA005,BWAZ006"
// 
// it isn't 100% clear what the rules are. These examples make different assumptions.

// assume the text is valid JSL that can be parsed back into an expression
// below, also assumes | but not &amp;amp; and no nested parens
expression = Parse( ex ); // Select Where( :BWA == "BWA005" | :BWA == "BWAZ006" )
// get the argument of the select where(...)
expression = Arg( expression, 1 ); // :BWA == "BWA005" | :BWA == "BWAZ006"
// assume there are only | (or) operators, change x|y|z =&amp;gt; {x,y,z}
Substitute Into( expression, Expr( Or() ), {} ); // {:BWA == "BWA005", :BWA == "BWAZ006"}
result = {}; // accumulate answer in a result list
// process the expression list
For Each( {op}, expression, // :BWA == "BWA005"   etc.
	// the Right Hand Side of :BWA == "BWA005" is "BWA005"
	RHS = Arg( Name Expr( op ), 2 ); // Equal(:BWA,"BWA005"), want 2nd arg
	// apply the test to see if this is one to keep
	If( !Is Missing( Regex( RHS, "(BWA|BWAZ|BWAZ_)([0-9]+)" ) ),
		Insert Into( result, RHS ); // yes: keep it in the result list
	);
);
// join the list of RHS strings, separated by commas
Show( Concat Items( result, "," ) ); // Concat Items(result, ",") = "BWA005,BWAZ006";

// alternate example
// assume the strings can't collide with column names, perhaps because no
// column name has a suffix number and all strings do have a suffix number
result = {};
Pat Match(
	ex,
	// this is the pattern; it repeats as long as it can
	Pat Repeat(
		// either match the BWA... pattern, in quotes, stashing the regex match into the result:
		( "\!"" + Pat Regex( "(BWA|BWAZ|BWAZ_)([0-9]+)" ) &amp;gt;&amp;gt; result[N Items( result ) + 1] + "\!"" )
		// &amp;gt;&amp;gt; is the patImmediate() operator that copies the LHS match into the RHS location
		// the list is initially 0 items long, and [nitems+1] extends the list by one more item
	| // OR the alternative... 
		Pat Len( 1 ) // advance one character. This is how most of the text is matched.
	)
);
Show( Concat Items( result, "," ) ); // Concat Items(result, ",") = "BWA005,BWAZ006";

// you *could* write a pattern match that would parse the text as an expression,
// but that is what the first example did.&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Mon, 11 Mar 2024 11:59:08 GMT</pubDate>
    <dc:creator>Craige_Hales</dc:creator>
    <dc:date>2024-03-11T11:59:08Z</dc:date>
    <item>
      <title>Regex: can't get all parts of the string that match</title>
      <link>https://community.jmp.com/t5/Discussions/Regex-can-t-get-all-parts-of-the-string-that-match/m-p/730835#M91373</link>
      <description>&lt;P class=""&gt;&lt;SPAN class=""&gt;I'm trying to extract every part of the string that matches the pattern. With Regex Match() I can't get the "BWA005, BWAZ006" that I want.&lt;/SPAN&gt;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;ex="Select Where( :\!"BWA\!"n == \!"BWA005\!" | :\!"BWA\!"n == \!"BWAZ006\!" )";
Regex match( ex, "[\!"](BWA|BWAZ|BWAZ_)([0-9]{1,})(.*?)[\!"]" )&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 08 Mar 2024 09:22:45 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Regex-can-t-get-all-parts-of-the-string-that-match/m-p/730835#M91373</guid>
      <dc:creator>lehaofeng</dc:creator>
      <dc:date>2024-03-08T09:22:45Z</dc:date>
    </item>
    <item>
      <title>Re: Regex: can't get all parts of the string that match</title>
      <link>https://community.jmp.com/t5/Discussions/Regex-can-t-get-all-parts-of-the-string-that-match/m-p/730877#M91377</link>
      <description>&lt;P&gt;I'm sure this is not the best solution, but Regex Match() doesn't seem to return every match the way Python's re.findall does. You can get a list of all matches with this crude script. Hopefully, someone knows a better way.&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;Names Default To Here( 1 );
ex = "Select Where( :\!"BWA\!"n == \!"BWA005\!" | :\!"BWA\!"n == \!"BWAZ006\!" )";
matchlist = {};
While( !Is Missing( Regex( ex, "(BWA|BWAZ|BWAZ_)(\d{1,})" ) ),
	a = Regex( ex, "(BWA|BWAZ|BWAZ_)(\d{1,})" );
	Insert Into( matchlist, a );
	Substitute Into( ex, a, "" );
);&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 08 Mar 2024 13:28:16 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Regex-can-t-get-all-parts-of-the-string-that-match/m-p/730877#M91377</guid>
      <dc:creator>mmarchandTSI</dc:creator>
      <dc:date>2024-03-08T13:28:16Z</dc:date>
    </item>
    <item>
      <title>Re: Regex: can't get all parts of the string that match</title>
      <link>https://community.jmp.com/t5/Discussions/Regex-can-t-get-all-parts-of-the-string-that-match/m-p/730902#M91380</link>
      <description>&lt;P&gt;Actually, if the string contains duplicates that you want to keep, like { "BWA005", "BWAZ006", "BWA005" }, do it this way instead (Substitute Into() removes every occurrence of the match at once, so the first script only finds each value once):&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;Names Default To Here( 1 );
ex = "Select Where( :\!"BWA\!"n == \!"BWA005\!" | :\!"BWA\!"n == \!"BWAZ006\!" )";
matchlist = {};
While( !Is Missing( Regex( ex, "(BWA|BWAZ|BWAZ_)(\d{1,})" ) ),
	a = Regex( ex, "(BWA|BWAZ|BWAZ_)(\d{1,})" );
	Insert Into( matchlist, a );
	b = Contains( ex, a );
	c = Length( a );
	ex = Substr( ex, b + c );
);&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 08 Mar 2024 15:44:12 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Regex-can-t-get-all-parts-of-the-string-that-match/m-p/730902#M91380</guid>
      <dc:creator>mmarchandTSI</dc:creator>
      <dc:date>2024-03-08T15:44:12Z</dc:date>
    </item>
    <item>
      <title>Re: Regex: can't get all parts of the string that match</title>
      <link>https://community.jmp.com/t5/Discussions/Regex-can-t-get-all-parts-of-the-string-that-match/m-p/731875#M91406</link>
      <description>&lt;P&gt;Thank you! It works!&lt;/P&gt;&lt;P&gt;I'd also like to know if there is a better way to do this.&lt;/P&gt;</description>
      <pubDate>Sun, 10 Mar 2024 10:44:10 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Regex-can-t-get-all-parts-of-the-string-that-match/m-p/731875#M91406</guid>
      <dc:creator>lehaofeng</dc:creator>
      <dc:date>2024-03-10T10:44:10Z</dc:date>
    </item>
    <item>
      <title>Re: Regex: can't get all parts of the string that match</title>
      <link>https://community.jmp.com/t5/Discussions/Regex-can-t-get-all-parts-of-the-string-that-match/m-p/732298#M91422</link>
      <description>&lt;P&gt;Better for a future maintainer of the JSL, or better for the computer, or some other measure? Here are a couple of ideas; I lean towards the first one, partly because I imagine you started with a JSL expression, not a text string.&lt;/P&gt;
&lt;P&gt;A note about comments and seeing trees vs seeing forests: my comments are tree-level. They don't describe your goal (the forest); they describe the trees. Anyone maintaining your JSL in the future will want both kinds. A forest-level comment should explain why you are thinking about these values. A tree-level comment explains how a non-obvious bit of code works.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;ex = "Select Where( :\!"BWA\!"n == \!"BWA005\!" | :\!"BWA\!"n == \!"BWAZ006\!" )";
// write(ex) // Select Where( :"BWA"n == "BWA005" | :"BWA"n == "BWAZ006" )
// Regex match( ex, "[\!"](BWA|BWAZ|BWAZ_)([0-9]{1,})(.*?)[\!"]" )

// desired result: "BWA005,BWAZ006"
// 
// it isn't 100% clear what the rules are. These examples make different assumptions.

// assume the text is valid JSL that can be parsed back into an expression
// below, also assumes | but not &amp;amp; and no nested parens
expression = Parse( ex ); // Select Where( :BWA == "BWA005" | :BWA == "BWAZ006" )
// get the argument of the select where(...)
expression = Arg( expression, 1 ); // :BWA == "BWA005" | :BWA == "BWAZ006"
// assume there are only | (or) operators, change x|y|z =&amp;gt; {x,y,z}
Substitute Into( expression, Expr( Or() ), {} ); // {:BWA == "BWA005", :BWA == "BWAZ006"}
result = {}; // accumulate answer in a result list
// process the expression list
For Each( {op}, expression, // :BWA == "BWA005"   etc.
	// the Right Hand Side of :BWA == "BWA005" is "BWA005"
	RHS = Arg( Name Expr( op ), 2 ); // Equal(:BWA,"BWA005"), want 2nd arg
	// apply the test to see if this is one to keep
	If( !Is Missing( Regex( RHS, "(BWA|BWAZ|BWAZ_)([0-9]+)" ) ),
		Insert Into( result, RHS ); // yes: keep it in the result list
	);
);
// join the list of RHS strings, separated by commas
Show( Concat Items( result, "," ) ); // Concat Items(result, ",") = "BWA005,BWAZ006";

// alternate example
// assume the strings can't collide with column names, perhaps because no
// column name has a suffix number and all strings do have a suffix number
result = {};
Pat Match(
	ex,
	// this is the pattern; it repeats as long as it can
	Pat Repeat(
		// either match the BWA... pattern, in quotes, stashing the regex match into the result:
		( "\!"" + Pat Regex( "(BWA|BWAZ|BWAZ_)([0-9]+)" ) &amp;gt;&amp;gt; result[N Items( result ) + 1] + "\!"" )
		// &amp;gt;&amp;gt; is the patImmediate() operator that copies the LHS match into the RHS location
		// the list is initially 0 items long, and [nitems+1] extends the list by one more item
	| // OR the alternative... 
		Pat Len( 1 ) // advance one character. This is how most of the text is matched.
	)
);
Show( Concat Items( result, "," ) ); // Concat Items(result, ",") = "BWA005,BWAZ006";

// you *could* write a pattern match that would parse the text as an expression,
// but that is what the first example did.&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 11 Mar 2024 11:59:08 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Regex-can-t-get-all-parts-of-the-string-that-match/m-p/732298#M91422</guid>
      <dc:creator>Craige_Hales</dc:creator>
      <dc:date>2024-03-11T11:59:08Z</dc:date>
    </item>
    <item>
      <title>Re: Regex: can't get all parts of the string that match</title>
      <link>https://community.jmp.com/t5/Discussions/Regex-can-t-get-all-parts-of-the-string-that-match/m-p/732356#M91427</link>
      <description>&lt;P&gt;For Regex Match() there is a wish-list item:&lt;BR /&gt;&lt;LI-MESSAGE title="Add flag to Regex Match() to find all non-overlapping occurances of pattern" uid="582080" url="https://community.jmp.com/t5/JMP-Wish-List/Add-flag-to-Regex-Match-to-find-all-non-overlapping-occurances/m-p/582080#U582080" discussion_style_icon_css="lia-mention-container-editor-message lia-img-icon-idea-thread lia-fa-icon lia-fa-idea lia-fa-thread lia-fa"&gt;&lt;/LI-MESSAGE&gt;. Depending on what you are trying to do (and why and where), there are many different ways of handling this (to name a few already mentioned: loops and JMP's own pattern matching), and there could be other options better suited to your use case.&lt;/P&gt;</description>
      <pubDate>Mon, 11 Mar 2024 14:14:59 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Regex-can-t-get-all-parts-of-the-string-that-match/m-p/732356#M91427</guid>
      <dc:creator>jthi</dc:creator>
      <dc:date>2024-03-11T14:14:59Z</dc:date>
    </item>
    <item>
      <title>Re: Regex: can't get all parts of the string that match</title>
      <link>https://community.jmp.com/t5/Discussions/Regex-can-t-get-all-parts-of-the-string-that-match/m-p/747010#M92651</link>
      <description>&lt;P&gt;With JMP 18, it got much easier to just use Python's re.findall:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;ex="Select Where( :\!"BWA\!"n == \!"BWA005\!" | :\!"BWA\!"n == \!"BWAZ006\!" )";
Python send(ex);
Python Submit ("
import re
# non-capturing group, so findall returns whole matches rather than (prefix, digits) tuples
matches = re.findall(r'(?:BWA|BWAZ)[0-9]+', ex)
");
Python get (matches)&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sat, 13 Apr 2024 19:11:43 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Regex-can-t-get-all-parts-of-the-string-that-match/m-p/747010#M92651</guid>
      <dc:creator>hogi</dc:creator>
      <dc:date>2024-04-13T19:11:43Z</dc:date>
    </item>
  </channel>
</rss>

