cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
Dipesh
Level I

Extract string from name

Hi, 

 

I would like to extract a string of data from a name. The position of the string varies in the name and the name has a variable number of sub-strings within it. An example is shown below:

 

(GPN) CI-4/E7/DH-1 15S-40 L1807 
DEO (GPN) JD CI-4/E7/DH-1 5S-40 L1807 

(GPN) JD CI-4/E7/DH-1 40 L1807 

 

I am interested in two parts: a string which contains the letter S (so in the examples, 15S-40 & 5S-40) AND if w is not present, I am interested in the numeric value. 

 

Is it possible to extract this using standard formula functions? 

 

Any help would be great. 

 

Thanks,

Dipesh 

1 ACCEPTED SOLUTION

Accepted Solutions
pauldeen
Level VI

Re: Extract string from name

You haven't provided an example of something with a W in it so that is untested. I think this will solve your problems:

YourList = {"(GPN) CI-4/E7/DH-1 15S-40 L1807", "DEO (GPN) JD CI-4/E7/DH-1 5S-40 L1807", "(GPN) JD CI-4/E7/DH-1 40 L1807"};
Part1 = Part2 = {};
For(i=1,i<=N items(YourList),i++,
	SubWords = Words(YourList[i], " ");
	For(k=1,k<=N Items(SubWords),k++,
		If(contains(SubWords[k],"S")>0,
			Part1[i] = SubWords[k];
			k=N Items(SubWords)+1;
		)
	);
);
For(i=1,i<=N items(Part1),i++,
	If(contains(part1[i],"W")==0,
		Part2[i] = right(part1[i],2);
	)
);

View solution in original post

4 REPLIES 4
txnelson
Super User

Re: Extract string from name

It appears that the component you want to strip off is the second to the last grouping.  If that is the case, then the code below will do that.  The Word() function is a very powerful function.  It's documentation can be found in either the Scripting Guide or the Scripting Index

     Help==>Books==>Scripting Guide

     Help==>Scripting Index==>Word

Names Default To Here( 1 );
dt = New Table( "test",
	New Column( "String",
		character,
		values(
			{"(GPN) CI-4/E7/DH-1 15S-40 L1807", "DEO (GPN) JD CI-4/E7/DH-1 5S-40 L1807",
			"(GPN) JD CI-4/E7/DH-1 40 L1807"}
		)
	)
);
dt << New Column( "Value", character, formula( Word( -2, :String, " " ) ) );

 

Jim
Dipesh
Level I

Re: Extract string from name

Hello,
Thanks for the reply.

Unfortunately there is no commonality where the word I would like to extract is in the string - sometimes it is position 2 sometimes it is position 7!
Regards,

Dipesh
thickey1
Level III

Re: Extract string from name

Using a Regular Expression you can match a 'space' followed by some characters until you hit an 'S-' then more characters until you hit the next 'space'

 

PAT = "(GPN) CI-4/E7/DH-1 15S-40 L1807";

x = regex(PAT,"\S+(?=S-).* ");
show(x);

pauldeen
Level VI

Re: Extract string from name

You haven't provided an example of something with a W in it so that is untested. I think this will solve your problems:

YourList = {"(GPN) CI-4/E7/DH-1 15S-40 L1807", "DEO (GPN) JD CI-4/E7/DH-1 5S-40 L1807", "(GPN) JD CI-4/E7/DH-1 40 L1807"};
Part1 = Part2 = {};
For(i=1,i<=N items(YourList),i++,
	SubWords = Words(YourList[i], " ");
	For(k=1,k<=N Items(SubWords),k++,
		If(contains(SubWords[k],"S")>0,
			Part1[i] = SubWords[k];
			k=N Items(SubWords)+1;
		)
	);
);
For(i=1,i<=N items(Part1),i++,
	If(contains(part1[i],"W")==0,
		Part2[i] = right(part1[i],2);
	)
);