Choose Language Hide Translation Bar
Highlighted
SunnyR
Level II

Extract numbers of varying lengths from a string into different columns

I have  a column named Duration with values which represent number of days,hours, min and sec. The last m denotes seconds.

So the typical format would be : (0-365)d(0-24)h(0-60)m(0-60)m

Here are some examples.:

0d 15h 47m 43m

10d 10h 21m 54m

0d 04h 07m 32m

268d 06h 02m 33m

 

I need to extract the number of days, Hours,minutes, and sec in different columns so that I can perform calculations on it.

I tried the formula for extracting the number of days,

 Words( :Duration, "d" )[1] 

 

For extracting the the second number - I used the formula :

Substr( Item( 2, :Duration ), 1, 2 );

In the same way extracted the third and fourth number using 3 and 4 in the Item formula.

To convert these strings to numeric, I created 4 more columns;copied and pasted and changed the data type to numeric.

I understand this is a round about way to get to the result.

I was wondering if anyone could help me with a more efficient way. 

 

4 ACCEPTED SOLUTIONS

Accepted Solutions
Highlighted

Re: Extract numbers of varying lengths from a string into different columns

How about using Regex:

 

For the Day:

Num( Regex( :Duration, "(\d+)d\s(\d+)h\s(\d+)m\s(\d+)m", "\1" ) )

For the Hour:

Num( Regex( :Duration, "(\d+)d\s(\d+)h\s(\d+)m\s(\d+)m", "\2" ) )

For the Minute:

Num( Regex( :Duration, "(\d+)d\s(\d+)h\s(\d+)m\s(\d+)m", "\3" ) )

For the Second:

Num( Regex( :Duration, "(\d+)d\s(\d+)h\s(\d+)m\s(\d+)m", "\4" ) )

Each number is in a capturing group although you really don't need to capture the others. You could simplify it with these:

 

 

Num( Regex( :Duration, "(\d+)d\s\d+h\s\d+m\s\d+m", "\1" ) )
Num( Regex( :Duration, "\d+d\s(\d+)h\s\d+m\s\d+m", "\1" ) )
Num( Regex( :Duration, "\d+d\s\d+h\s(\d+)m\s\d+m", "\1" ) )
Num( Regex( :Duration, "\d+d\s\d+h\s\d+m\s(\d+)m", "\1" ) )

 

 

 

View solution in original post

Highlighted

Re: Extract numbers of varying lengths from a string into different columns

This example illustrates how to use regular expressions.

 

Names Default To Here( 1 );

cell = {"0d 15h 47m 43m", "10d 10h 21m 54m", "0d 04h 07m 32m", "268d 06h 02m 33m"};

For( i = 1, i <= N Items( cell ), i++,
	day = Num( Regex( cell[i], "(\d+)d (\d+)h (\d+)m (\d+)m", "\1" ) );
	Show( day );
);
Learn it once, use it forever!

View solution in original post

Highlighted
ZF
ZF
Level II

Re: Extract numbers of varying lengths from a string into different columns

Use Recode -Formula Column-under the New values hot spot select Advanced - Extract  Segment. Screen shots for extract d, sec,  and jmp file with saved formula for day, hour, second attached. 

Capture d.PNG

Capture_sec.PNG

View solution in original post

Highlighted

Re: Extract numbers of varying lengths from a string into different columns

@txnelson is correct. He showed you how to use it as a separate script. This expression is all that you need for a column formula:

 

Num( Regex( cell[i], "(\d+)d (\d+)h (\d+)m (\d+)m", "\1" ) )
Learn it once, use it forever!

View solution in original post

12 REPLIES 12
Highlighted

Re: Extract numbers of varying lengths from a string into different columns

How about using Regex:

 

For the Day:

Num( Regex( :Duration, "(\d+)d\s(\d+)h\s(\d+)m\s(\d+)m", "\1" ) )

For the Hour:

Num( Regex( :Duration, "(\d+)d\s(\d+)h\s(\d+)m\s(\d+)m", "\2" ) )

For the Minute:

Num( Regex( :Duration, "(\d+)d\s(\d+)h\s(\d+)m\s(\d+)m", "\3" ) )

For the Second:

Num( Regex( :Duration, "(\d+)d\s(\d+)h\s(\d+)m\s(\d+)m", "\4" ) )

Each number is in a capturing group although you really don't need to capture the others. You could simplify it with these:

 

 

Num( Regex( :Duration, "(\d+)d\s\d+h\s\d+m\s\d+m", "\1" ) )
Num( Regex( :Duration, "\d+d\s(\d+)h\s\d+m\s\d+m", "\1" ) )
Num( Regex( :Duration, "\d+d\s\d+h\s(\d+)m\s\d+m", "\1" ) )
Num( Regex( :Duration, "\d+d\s\d+h\s\d+m\s(\d+)m", "\1" ) )

 

 

 

View solution in original post

Highlighted

Re: Extract numbers of varying lengths from a string into different columns

Attached sample table.

Highlighted
SunnyR
Level II

Re: Extract numbers of varying lengths from a string into different columns

Paul- I learnt a new function. Thanks a lot for your detailed and simplified version of the solution.!

Highlighted

Re: Extract numbers of varying lengths from a string into different columns

This example illustrates how to use regular expressions.

 

Names Default To Here( 1 );

cell = {"0d 15h 47m 43m", "10d 10h 21m 54m", "0d 04h 07m 32m", "268d 06h 02m 33m"};

For( i = 1, i <= N Items( cell ), i++,
	day = Num( Regex( cell[i], "(\d+)d (\d+)h (\d+)m (\d+)m", "\1" ) );
	Show( day );
);
Learn it once, use it forever!

View solution in original post

Highlighted
SunnyR
Level II

Re: Extract numbers of varying lengths from a string into different columns

Mark,

  I tried using your script in many different ways 

a) by using the script in the script editor and running it,

b) creating a column with name "cell" and entering the list of items in it + creating another column "day" and using the "for loop" as a formula in that column,

and other ways.

I get an error.

I tried looking around in the scripting index for help too.

 But I cannot get the script to work.  I am new to scripting. Can you please help.

Highlighted
txnelson
Super User

Re: Extract numbers of varying lengths from a string into different columns

@markbailey script is a standalone JSL.  You run it by putting it into a script window and running the script.  I did that and it worked great, giving the following results in the JMP log

day = 0;
day = 10;
day = 0;
day = 268;
Jim
Highlighted
SunnyR
Level II

Re: Extract numbers of varying lengths from a string into different columns

Thanks a lot for pointing me to the log. I saw the result there.Previously I was thinking that the result will show up in a separate window.

Highlighted

Re: Extract numbers of varying lengths from a string into different columns

@txnelson is correct. He showed you how to use it as a separate script. This expression is all that you need for a column formula:

 

Num( Regex( cell[i], "(\d+)d (\d+)h (\d+)m (\d+)m", "\1" ) )
Learn it once, use it forever!

View solution in original post

Highlighted
SunnyR
Level II

Re: Extract numbers of varying lengths from a string into different columns

Mark

     I was under the impression that the result will show up in a separate window. Now, I looked into the log- and I saw the result there,- as many times I had run the script.