I have a large data table which has a column containing char text. An example of the text description is listed below.
I need to do some descriptive statistics on the duration to open/close for the robot.
I would like to extract the numeric portion of the character string and put it in a new column with numeric data type.
Is there a formula function I could use to do that on a brand new column ?
Sample text
: Valve Movement to "CLOSE" position Completed. Time Taken for Movement "1.530000" Secs
: Valve Movement to "CLOSE" position Completed. Time Taken for Movement "2.030000" Secs
: Valve Movement to "CLOSE" position Completed. Time Taken for Movement "2.010000" Secs
: Valve Movement to "CLOSE" position Completed. Time Taken for Movement "2.510000" Secs
: Valve Movement to "CLOSE" position Completed. Time Taken for Movement "2.215000" Secs
: Valve Movement to "CLOSE" position Completed. Time Taken for Movement "2.215000" Secs
: Valve Movement to "OPEN" position Completed. Time Taken for Movement "3.160000" Secs
: Valve Movement to "OPEN" position Completed. Time Taken for Movement "3.160000" Secs
: Valve Movement to "CLOSE" position Completed. Time Taken for Movement "1.815000" Secs
: Valve Movement to "CLOSE" position Completed. Time Taken for Movement "1.815000" Secs
: Valve Movement to "OPEN" position Completed. Time Taken for Movement "3.140000" Secs
: Valve Movement to "OPEN" position Completed. Time Taken for Movement "3.140000" Secs
: Valve Movement to "OPEN" position Completed. Time Taken for Movement "2.955000" Secs
Hi JayV.,
The Word() function will also take an argument to specify the delimiters, so you can avoid the Substr() part of your formula and just use:
Num( Word( 11, :Text, "\!" " ) )
The \! is the escape string to tell the parser that the " is part of the quoted string.
-Jeff
Depends upon EXACTLY what you want. Does the following come close?
data have;
input string & $100.;
cards;
Valve Movement to "CLOSE" position Completed. Time Taken for Movement "1.530000" Secs
: Valve Movement to "CLOSE" position Completed. Time Taken for Movement "2.030000" Secs
: Valve Movement to "CLOSE" position Completed. Time Taken for Movement "2.010000" Secs
: Valve Movement to "CLOSE" position Completed. Time Taken for Movement "2.510000" Secs
: Valve Movement to "CLOSE" position Completed. Time Taken for Movement "2.215000" Secs
: Valve Movement to "CLOSE" position Completed. Time Taken for Movement "2.215000" Secs
: Valve Movement to "OPEN" position Completed. Time Taken for Movement "3.160000" Secs
: Valve Movement to "OPEN" position Completed. Time Taken for Movement "3.160000" Secs
: Valve Movement to "CLOSE" position Completed. Time Taken for Movement "1.815000" Secs
: Valve Movement to "CLOSE" position Completed. Time Taken for Movement "1.815000" Secs
: Valve Movement to "OPEN" position Completed. Time Taken for Movement "3.140000" Secs
: Valve Movement to "OPEN" position Completed. Time Taken for Movement "3.140000" Secs
: Valve Movement to "OPEN" position Completed. Time Taken for Movement "2.955000" Secs
;
data want (drop=string both);
set have;
both=compress(substr(string,anydigit(string)),'"');
num=input(substr(both,1,anyalpha(both)-1),12.);
unit=substr(both,anyalpha(both));
run;
Arthur,
Thank you for your reply. I started reading the Char functions help manual.
I applied the following formula on my text column which seemed to work.
Num(Substr(Word(12, :Text), 2, 8))
a) Extract 12th word from Text string (i.e. "1.530000")
b) Extract substring between quotations (i.e. 1.530000)
c) Convert character to numeric using Num function (i.e. 1.53)
Try the scan function with the " as your delimiter.
scan(string, 4, '"')
Hi JayV.,
The Word() function will also take an argument to specify the delimiters, so you can avoid the Substr() part of your formula and just use:
Num( Word( 11, :Text, "\!" " ) )
The \! is the escape string to tell the parser that the " is part of the quoted string.
-Jeff
Your solution worked perfectly also.
I learnt something new with regards to escape string/delimiters. Thanks !
Hi I have a need to extract X & Y values from a text string & separate column, whats the formula? Thanks!
SITEPOS |
X24Y34 |
X24Y35 |
X24Y36 |
X24Y37 |
X24Y38 |
X24Y39 |
X25Y34 |
X25Y35 |
X25Y36 |
X25Y37 |
X25Y38 |
X25Y39 |
X26Y33 |
X26Y34 |
X26Y35 |
X26Y36 |
X26Y37 |
X26Y38 |
X27Y33 |
X27Y34 |
X27Y35 |
Can we assume that the X and Y values are always two digits? If so, here is the formula for the X:
Num(Substr(:SITEPOS, 2, 2))
And here is the formula for the Y:
Num(Substr(:SITEPOS, 5))
Note that I converted both the X and Y values into numbers rather than leaving them as text.
Hi Dan,
Thanks for prompt response. I have a mix of 2 & single digit sandwich inside the string.
SITEPOS |
X14Y1 |
X14Y2 |
X14Y3 |
X14Y4 |
X14Y5 |
X14Y6 |
X15Y1 |
X15Y2 |
X15Y3 |
X15Y4 |
X15Y5 |
X15Y6 |
X0Y20 |
X0Y21 |
X0Y22 |
X0Y23 |
X0Y24 |
X0Y25 |
X1Y20 |
X1Y21 |
X1Y22 |
X1Y23 |
X1Y24 |
X1Y25 |
X18Y20 |
X18Y21 |
X18Y22 |
X18Y23 |
X18Y24 |
X18Y25 |
X19Y20 |
X19Y21 |
X19Y22 |
X19Y23 |
Hi Dan,
Thanks for prompt response. I have a mix of 2 & single digit sandwich inside the string.
SITEPOS |
X14Y1 |
X14Y2 |
X14Y3 |
X14Y4 |
X14Y5 |
X14Y6 |
X15Y1 |
X15Y2 |
X15Y3 |
X15Y4 |
X15Y5 |
X15Y6 |
X0Y20 |
X0Y21 |
X0Y22 |
X0Y23 |
X0Y24 |
X0Y25 |
X1Y20 |
X1Y21 |
X1Y22 |
X1Y23 |
X1Y24 |
X1Y25 |
X18Y20 |
X18Y21 |
X18Y22 |
X18Y23 |
X18Y24 |
X18Y25 |
X19Y20 |
X19Y21 |
X19Y22 |
X19Y23 |