<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Extracting a character string of varying lengths and positions in Discussions</title>
    <link>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46405#M26449</link>
    <description>&lt;P&gt;The code I included is a Column Formula.&amp;nbsp; I have attached a data table with the Dx1 column you used in your example, along with the new Var1 and Var2 columns that have their Column Formulas specified.&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Wed, 25 Oct 2017 18:22:41 GMT</pubDate>
    <dc:creator>txnelson</dc:creator>
    <dc:date>2017-10-25T18:22:41Z</dc:date>
    <item>
      <title>Extracting a character string of varying lengths and positions</title>
      <link>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46335#M26405</link>
      <description>&lt;P&gt;I want to extract character strings from a variable (Dx1). The length and position of the strings vary, but they are separated by dashes. Sometimes there are multiple dashes. For example...&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;"7812-abcdef&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; -9"&lt;/P&gt;
&lt;P&gt;"v7892-xyz&amp;nbsp;&amp;nbsp;&amp;nbsp; -0"&lt;/P&gt;
&lt;P&gt;"812-fghkj&amp;nbsp;&amp;nbsp;&amp;nbsp; -0"&lt;/P&gt;
&lt;P&gt;"17361-gf-jhyt&amp;nbsp;&amp;nbsp; -9"&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What I would like to do is create a new variable that extracts the digits/characters before the first dash, and then another variable that extracts the characters between the first and second dashes.&amp;nbsp;So,&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Var1&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; Var2&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;7812&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; abcdef&lt;/P&gt;
&lt;P&gt;v7892&amp;nbsp;&amp;nbsp;&amp;nbsp; xyz&lt;/P&gt;
&lt;P&gt;812&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; fghkj&lt;/P&gt;
&lt;P&gt;17361&amp;nbsp;&amp;nbsp;&amp;nbsp; gf-jhyt&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I tried following the example posted here, but it just extracted the&amp;nbsp;dash. Changing the +1 to another number extracts that number of characters, but my data length varies.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://community.jmp.com/t5/Uncharted/JSL-Character-String-Functions/ba-p/21323" target="_blank"&gt;https://community.jmp.com/t5/Uncharted/JSL-Character-String-Functions/ba-p/21323&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here's what I wrote. What am I doing wrong?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;Trim(
Substr(
:Name( "Dx1" ),
Contains( :Name( "Dx1" ), "-" ),
1
)
)&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 25 Oct 2017 13:19:27 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46335#M26405</guid>
      <dc:creator>jswislar</dc:creator>
      <dc:date>2017-10-25T13:19:27Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting a character string of varying lengths and positions</title>
      <link>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46336#M26406</link>
      <description>&lt;P&gt;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/7258"&gt;@jswislar&lt;/a&gt;&amp;nbsp;:&amp;nbsp;&lt;BR /&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;One way to tackle this could be to use Contains and Substr functions&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;a = {"abc-12","abc - 45","abc-  89"}; 
List1 = list(); 
List2 = list(); 
For(i = 1 , i &amp;lt;= N Items(a), i++,
		Pos = Contains(a[i],"-"); 
		Insert Into(List1,Substr(a[i],1,Pos-1)); 
		Insert Into(List2,Substr(a[i],Pos+1,Length(a[i]))); 
   );&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 24 Oct 2017 14:19:04 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46336#M26406</guid>
      <dc:creator>uday_guntupalli</dc:creator>
      <dc:date>2017-10-24T14:19:04Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting a character string of varying lengths and positions</title>
      <link>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46343#M26410</link>
      <description>&lt;P&gt;You want to use the Word() function.&amp;nbsp; If you create a new column, "Var1", and place the below formula into the column, you will get the results you want&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;word(1,:Dx1,"-")&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;For Var2 it would be:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;word(2,:Dx1,"-")&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The Word() function is documented in the Scripting Index&lt;/P&gt;
&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp;Help==&amp;gt;Scripting Index==&amp;gt;Word&lt;/P&gt;</description>
      <pubDate>Tue, 24 Oct 2017 15:27:47 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46343#M26410</guid>
      <dc:creator>txnelson</dc:creator>
      <dc:date>2017-10-24T15:27:47Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting a character string of varying lengths and positions</title>
      <link>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46344#M26411</link>
      <description>&lt;P&gt;Thanks! That works well.&lt;/P&gt;</description>
      <pubDate>Tue, 24 Oct 2017 15:30:11 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46344#M26411</guid>
      <dc:creator>jswislar</dc:creator>
      <dc:date>2017-10-24T15:30:11Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting a character string of varying lengths and positions</title>
      <link>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46345#M26412</link>
      <description>&lt;P&gt;The &lt;STRONG&gt;words&lt;/STRONG&gt; function is your friend here.&amp;nbsp; Use the dash and space as delimiters and you can extract what you want easily.&amp;nbsp; Here's an example that uses formulas in a table:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;dt = New Table( "Untitled", Add Rows( 4 ), 
	New Column( "dx1", Character, "Nominal",
		Set Values( {"7812-abcdef     -9", "v7892-xyz    -0", "812-fghkj    -0",
			"17361-gf-jhyt   -9"}
		)
	),
// Get the first word for Var1
	New Column( "Var1", Character, "Nominal", Formula( Words( :dx1, "- " )[1] ) ),

// Get the second word for Var2
	New Column( "Var2", Character, "Nominal", Formula( Words( :dx1, "- " )[2] ) )
);&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 24 Oct 2017 15:35:36 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46345#M26412</guid>
      <dc:creator>pmroz</dc:creator>
      <dc:date>2017-10-24T15:35:36Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting a character string of varying lengths and positions</title>
      <link>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46346#M26413</link>
      <description>&lt;P&gt;Is there a way to get WORD to ignore some dashes? Some of the second words I'm trying to extract are hyphenated, so I'm only getting a partial extraction. For example, for "938-ex-marine" I get "938" and "ex" rather than "ex-marine"&lt;/P&gt;</description>
      <pubDate>Tue, 24 Oct 2017 15:35:48 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46346#M26413</guid>
      <dc:creator>jswislar</dc:creator>
      <dc:date>2017-10-24T15:35:48Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting a character string of varying lengths and positions</title>
      <link>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46350#M26416</link>
      <description>&lt;P&gt;JSL will allow the building and parsing of any of these complex conditions.&amp;nbsp; It is up to the humans to determine what the rules are for such parsing.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In your example, what is the rule to determine what hyphenated values should be read as a combined value, and when to treat them as separate values?&lt;/P&gt;</description>
      <pubDate>Tue, 24 Oct 2017 15:54:47 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46350#M26416</guid>
      <dc:creator>txnelson</dc:creator>
      <dc:date>2017-10-24T15:54:47Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting a character string of varying lengths and positions</title>
      <link>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46351#M26417</link>
      <description>&lt;P&gt;The only rule that I can think of that would be consistent would be that the second word appears between numbers (#). Ultimately I want the phrase between the hyphens. So, "#-word&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; -#" and "#-word-word&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; -#" and "#-word-word word&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; -#"&lt;/P&gt;</description>
      <pubDate>Tue, 24 Oct 2017 16:03:58 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46351#M26417</guid>
      <dc:creator>jswislar</dc:creator>
      <dc:date>2017-10-24T16:03:58Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting a character string of varying lengths and positions</title>
      <link>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46354#M26419</link>
      <description>&lt;P&gt;Here is a formula that will do what you stated as the rule....&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;Final_Value = Word( 2, :Dx1, "-" );
i = 3;
(While( Is Missing( Num( Word( i, :Dx1, "-" ) ) ) == 1,
	Final_Value = Final_Value || "-" || Word( i, :Dx1, "-" );
	i++;
) ; Final_Value);&lt;/CODE&gt;&lt;/PRE&gt;
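&lt;P&gt;To see how the loop behaves on a single value, here is a rough sketch for a script window. The variable s is a hypothetical literal standing in for :Dx1, and the i &amp;lt;= 20 cap and the final Trim() are added assumptions, not part of the formula above.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;// s is an illustrative literal standing in for one Dx1 value
s = "17361-gf-jhyt   -9";
Final_Value = Word( 2, s, "-" );
i = 3;
// Keep appending pieces until one converts to a number;
// the cap on i is an added safeguard against strings with no numeric piece
While( Is Missing( Num( Word( i, s, "-" ) ) ) &amp;amp; i &amp;lt;= 20,
	Final_Value = Final_Value || "-" || Word( i, s, "-" );
	i++;
);
// Trim() drops the padding spaces before the last dash
Show( Trim( Final_Value ) );&lt;/CODE&gt;&lt;/PRE&gt;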
&lt;P&gt;However, in your sample above, the sample "17361-gf-jhyt&amp;nbsp; &amp;nbsp;-9" ends up with a Var2 value of "gf-jhyt".&amp;nbsp; Is that what you want?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Regardless, I hope you can see that you can construct as complex a formula as you may need to parse the string into whatever the correct values are.&amp;nbsp; The only limitation is that you need to be able to determine the rules that need to be used.&lt;/P&gt;</description>
      <pubDate>Tue, 24 Oct 2017 17:18:08 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46354#M26419</guid>
      <dc:creator>txnelson</dc:creator>
      <dc:date>2017-10-24T17:18:08Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting a character string of varying lengths and positions</title>
      <link>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46385#M26442</link>
      <description>Thanks, Jim. That is what I was looking for.&lt;BR /&gt;How do I run this script? I tried it as a column formula. That didn't work, but I didn't expect it to. I also opened a script window and ran it there, but it just stopped responding.</description>
      <pubDate>Wed, 25 Oct 2017 13:18:15 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46385#M26442</guid>
      <dc:creator>jswislar</dc:creator>
      <dc:date>2017-10-25T13:18:15Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting a character string of varying lengths and positions</title>
      <link>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46405#M26449</link>
      <description>&lt;P&gt;The code I included is a Column Formula.&amp;nbsp; I have attached a data table with the Dx1 column you used in your example, along with the new Var1 and Var2 columns that have their Column Formulas specified.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 25 Oct 2017 18:22:41 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46405#M26449</guid>
      <dc:creator>txnelson</dc:creator>
      <dc:date>2017-10-25T18:22:41Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting a character string of varying lengths and positions</title>
      <link>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46427#M26464</link>
      <description>&lt;P&gt;Thanks, Jim. That is how I tried it previously. For some reason it seems to crash JMP Pro 13 when I run it. Maybe because my data table is 65,000 lines?&lt;/P&gt;</description>
      <pubDate>Thu, 26 Oct 2017 15:42:25 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46427#M26464</guid>
      <dc:creator>jswislar</dc:creator>
      <dc:date>2017-10-26T15:42:25Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting a character string of varying lengths and positions</title>
      <link>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46435#M26467</link>
      <description>&lt;P&gt;65,000 rows is not a very large table for JMP to handle.&amp;nbsp; JMP can handle data tables with billions of rows.&amp;nbsp; The issue is data inconsistencies in your data table.&amp;nbsp; The code assumes that at least one numeric field will be found after finding the character values for Var2.&amp;nbsp; Apparently there is a situation where a value of Dx1 does not end with numbers.&amp;nbsp; The following change to the formula adjusts for that issue:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;Final_Value = Word( 2, :Dx1, "-" );
i = 3;
(While( Is Missing( Num( Word( i, :Dx1, "-" ) ) ) == 1 &amp;amp; Word( i, :Dx1, "-" ) != "",
	Final_Value = Final_Value || "-" || Word( i, :Dx1, "-" );
	i++;
) ; Final_Value);&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 26 Oct 2017 19:46:22 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46435#M26467</guid>
      <dc:creator>txnelson</dc:creator>
      <dc:date>2017-10-26T19:46:22Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting a character string of varying lengths and positions</title>
      <link>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46444#M26474</link>
      <description>&lt;P&gt;Thanks again, Jim.&lt;/P&gt;&lt;P&gt;I've used many more rows than this. It was just a hypothesis.&lt;/P&gt;&lt;P&gt;Every case of DX1 ends in a number, but the number of spaces between what I'm extracting for Var2 and that number varies. Maybe that was causing it. Either way, your solution worked.&lt;/P&gt;</description>
      <pubDate>Thu, 26 Oct 2017 21:08:40 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46444#M26474</guid>
      <dc:creator>jswislar</dc:creator>
      <dc:date>2017-10-26T21:08:40Z</dc:date>
    </item>
    <item>
      <title>Re: Extracting a character string of varying lengths and positions</title>
      <link>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46496#M26506</link>
      <description>&lt;P&gt;The &lt;EM&gt;Regex()&lt;/EM&gt; function could be an efficient alternative here if you experience stability or performance problems (and a&amp;nbsp;while-loop that goes infinite in a column formula&amp;nbsp;can be frustrating&amp;nbsp;to debug).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;//Formula for Var1
Regex(Dx1, "(^.*?)-", "\1");

//Formula for Var2
Regex(Dx1, "-(.+)-\d", "\1");
&lt;/CODE&gt;&lt;/PRE&gt;
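&lt;P&gt;Both patterns can be tried on a literal string before they go into a column formula. In this sketch, s is a hypothetical sample value, and the last pattern, with a lazy group followed by \s*, is an assumed variant that leaves the padding spaces out of the capture; it is not part of the formulas above.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;// s is an illustrative literal standing in for one Dx1 value
s = "17361-gf-jhyt   -9";
// Var1: lazy match of everything before the first dash
Show( Regex( s, "(^.*?)-", "\1" ) );
// Var2 as posted: greedy match up to the last dash that precedes a digit;
// the capture keeps the padding spaces, so Trim() is applied here
Show( Trim( Regex( s, "-(.+)-\d", "\1" ) ) );
// Assumed variant: lazy capture, then optional whitespace before the final dash and digit
Show( Regex( s, "-(.+?)\s*-\d", "\1" ) );&lt;/CODE&gt;&lt;/PRE&gt;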
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 28 Oct 2017 21:54:37 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Extracting-a-character-string-of-varying-lengths-and-positions/m-p/46496#M26506</guid>
      <dc:creator>ms</dc:creator>
      <dc:date>2017-10-28T21:54:37Z</dc:date>
    </item>
  </channel>
</rss>

