cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
markschahl
Level V

Extract all desired elements from a string to enable multiple response analysis

Hi! I have data from a SharePoint which we use to (among other things) to track what tool/technique(s) was/were used to solve a problem. This includes JMP, BTW.  The data (in one column) follow this type of ;-delimited formatting:

<technique description>;#<technique ID>;<technique description>;#<technique ID>;<technique description>;#<technique ID>...

3.4 Offline Optimization & Simulation Tools;#15;#3.9 Simulation;#30;#4.6 Material Balance;#22

 

Each row in the table (problem solved) can have one or more <technique ID>. Sometimes the <description> starts with #, sometimes not. But <technique ID> is always preceded by #.

 

I would like to parse the string for the <technique ID>'s  and format for Multiple Response analysis: 15, 30, 22 (above example).

 

  1. There are ~100 <technique ID>, so creating ~100 sparse columns seems wasteful.  Is the approach used in the Sample Data file Failures3Delimited.jmp the most efficient way to approach this problem?
  2. I've started to learn REGEX and have been able to find one occurrence of <technique ID>, but not all. Do I need a loop? Some other way to do this?

Thanks. Using JMP17.1 Pro.

1 ACCEPTED SOLUTION

Accepted Solutions
hogi
Level XII

Re: Extract all desired elements from a string to enable multiple response analysis

@hogi wrote:

... complicated split first-second ... 

actually -  less complicated than I thought

 

New Column( "IDs",
	Character,
	"Multiple Response",
	Formula(
		splitItems = Words( :Input, ";#" );
		Concat Items( splitItems[2:: N Items( splitItems ):: 2], "," );
	)
)

 

... and one of the next versions of Jmp will make it even easier:
https://community.jmp.com/t5/JMP-Wish-List/new-slicing-options-for-matrices/idc-p/614326/highlight/t... 

 

 

View solution in original post

9 REPLIES 9
txnelson
Super User

Re: Extract all desired elements from a string to enable multiple response analysis

Here is a potential solution that I came up with:

txnelson_0-1697159508330.png

Here is the formula I used

i = 1;
string = "";
While( Word( i, :Input, "#;" ) != "",
	theWord = Word( i, :Input, "#;" );
	If( Is Missing( Num( theWord ) ) == 0,
		string = string || "," || theWord
	);
	i++;
);
Try( Substr( string, 2 ) );
Jim
hogi
Level XII

Re: Extract all desired elements from a string to enable multiple response analysis

Very elegant - instead of a complicated split first-second (twice the same delimiter) or some which part is the TEXT, just:

find the NUMBERS

Let's hope there is no ID #30-2  ...

or a description: 42

 

New Column( "IDs",
	Character,
	"Multiple Response",
	Formula(
		Concat Items(
			Transform Each( {theWord}, Words( :Input, ";#" ),
				Char( Num( theWord ) )
			),
			""
		)
	),
	Set Property( "Multiple Response", Multiple Response( Separator( "." ) ) ),
)

 

markschahl
Level V

Re: Extract all desired elements from a string to enable multiple response analysis

Thanks, Jim! I was thinking that some sort of loop would do the trick, with or without REGEX. Hope that you are well!

jthi
Super User

Re: Extract all desired elements from a string to enable multiple response analysis

Won't go suggesting using JMP's Character Pattern as it is fairly difficult to understand (at least all the different functionalities) but I did manage (maybe) to do this using it

Names Default To Here(1);

str = "3.4 Offline Optimization & Simulation Tools;#15;#3.9 Simulation;#30;#4.6 Material Balance;#22";

idlist2 = {};
resval = "";
Pat Match(str, Pat Span("#") + Pat Regex("\d+?") >> next_id + Pat Regex("(;|$)") + Pat Test(
	Insert Into(idlist2, next_id); resval = Concat Items(idlist2, ", "))
);
show(idlist2, resval);

here is example using it in formula

Names Default To Here(1);

dt = New Table("Untitled",
	Add Rows(1),
	Compress File When Saved(1),
	New Column("Column 1",
		Character,
		"Nominal",
		Set Values(
			{
			"3.4 Offline Optimization & Simulation Tools;#15;#3.9 Simulation;#30;#4.6 Material Balance;#22"
			}
		)
	),
	New Column("Column 2",
		Character,
		"Multiple Response",
		Formula(
			idlist2 = {};
			resval = "";
			Pat Match(
				:Column 1,
				Pat Span("#") + Pat Immediate(Pat Regex("\d+?"), next_id)
				+Pat Regex("(;|$)") + Pat Test(
					Insert Into(idlist2, next_id);
					resval = Concat Items(idlist2, ", ");
				)
			);
			resval;
		),
		Set Property("Multiple Response", Multiple Response(Separator(",")))
	)
)
-Jarmo
hogi
Level XII

Re: Extract all desired elements from a string to enable multiple response analysis

@hogi wrote:

... complicated split first-second ... 

actually -  less complicated than I thought

 

New Column( "IDs",
	Character,
	"Multiple Response",
	Formula(
		splitItems = Words( :Input, ";#" );
		Concat Items( splitItems[2:: N Items( splitItems ):: 2], "," );
	)
)

 

... and one of the next versions of Jmp will make it even easier:
https://community.jmp.com/t5/JMP-Wish-List/new-slicing-options-for-matrices/idc-p/614326/highlight/t... 

 

 

markschahl
Level V

Re: Extract all desired elements from a string to enable multiple response analysis

hogi: Thanks for both your submissions! I'd used Words() before, but didn't think of it because early on I convinced myself that I would have to pursue a REGEX solution. This solution is very elegant! I need to read up on Concat Items()...

Jim/Jarmo/hogi: you have showed the global power of the community! I now sit in Kuala Lumpur. I posted this first thing my morning when you were likely about to sleep/already sleeping. I got a solution from you by my late afternoon. Thank you again!

markschahl
Level V

Re: Extract all desired elements from a string to enable multiple response analysis

Jarmo: Thanks for your effort! I read the REGEX and Pat Match sections in the Scripting Manual and decided to not try Pat Match unless absolutely necessary...

Re: Extract all desired elements from a string to enable multiple response analysis

The necessary change is the current delimiter to one JMP recognizes as multiple responses. The Substitute function would also work well in this case.

hogi
Level XII

Re: Extract all desired elements from a string to enable multiple response analysis

?