turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

- JMP User Community
- :
- Discussions
- :
- Discussions
- :
- Extract string from name

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Jun 8, 2017 3:50 AM
(2267 views)

Hi,

I would like to extract a string of data from a name. The position of the string varies in the name and the name has a variable number of sub-strings within it. An example is shown below:

(GPN) CI-4/E7/DH-1 15S-40 L1807

DEO (GPN) JD CI-4/E7/DH-1 5S-40 L1807

(GPN) JD CI-4/E7/DH-1 40 L1807

I am interested in two parts: a string which contains the letter S (so in the examples, 15S-40 & 5S-40) AND if w is not present, I am interested in the numeric value.

Is it possible to extract this using standard formula functions?

Any help would be great.

Thanks,

Dipesh

1 ACCEPTED SOLUTION

Accepted Solutions

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

You haven't provided an example of something with a W in it so that is untested. I think this will solve your problems:

```
YourList = {"(GPN) CI-4/E7/DH-1 15S-40 L1807", "DEO (GPN) JD CI-4/E7/DH-1 5S-40 L1807", "(GPN) JD CI-4/E7/DH-1 40 L1807"};
Part1 = Part2 = {};
For(i=1,i<=N items(YourList),i++,
SubWords = Words(YourList[i], " ");
For(k=1,k<=N Items(SubWords),k++,
If(contains(SubWords[k],"S")>0,
Part1[i] = SubWords[k];
k=N Items(SubWords)+1;
)
);
);
For(i=1,i<=N items(Part1),i++,
If(contains(part1[i],"W")==0,
Part2[i] = right(part1[i],2);
)
);
```

4 REPLIES

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

It appears that the component you want to strip off is the second to the last grouping. If that is the case, then the code below will do that. The Word() function is a very powerful function. It's documentation can be found in either the Scripting Guide or the Scripting Index

Help==>Books==>Scripting Guide

Help==>Scripting Index==>Word

```
Names Default To Here( 1 );
dt = New Table( "test",
New Column( "String",
character,
values(
{"(GPN) CI-4/E7/DH-1 15S-40 L1807", "DEO (GPN) JD CI-4/E7/DH-1 5S-40 L1807",
"(GPN) JD CI-4/E7/DH-1 40 L1807"}
)
)
);
dt << New Column( "Value", character, formula( Word( -2, :String, " " ) ) );
```

Jim

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Hello,

Thanks for the reply.

Unfortunately there is no commonality where the word I would like to extract is in the string - sometimes it is position 2 sometimes it is position 7!

Regards,

Dipesh

Thanks for the reply.

Unfortunately there is no commonality where the word I would like to extract is in the string - sometimes it is position 2 sometimes it is position 7!

Regards,

Dipesh

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

Using a Regular Expression you can match a 'space' followed by some characters until you hit an 'S-' then more characters until you hit the next 'space'

PAT = "(GPN) CI-4/E7/DH-1 15S-40 L1807";

x = regex(PAT,"\S+(?=S-).* ");

show(x);

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Get Direct Link
- Email to a Friend
- Report Inappropriate Content

You haven't provided an example of something with a W in it so that is untested. I think this will solve your problems:

```
YourList = {"(GPN) CI-4/E7/DH-1 15S-40 L1807", "DEO (GPN) JD CI-4/E7/DH-1 5S-40 L1807", "(GPN) JD CI-4/E7/DH-1 40 L1807"};
Part1 = Part2 = {};
For(i=1,i<=N items(YourList),i++,
SubWords = Words(YourList[i], " ");
For(k=1,k<=N Items(SubWords),k++,
If(contains(SubWords[k],"S")>0,
Part1[i] = SubWords[k];
k=N Items(SubWords)+1;
)
);
);
For(i=1,i<=N items(Part1),i++,
If(contains(part1[i],"W")==0,
Part2[i] = right(part1[i],2);
)
);
```