In a previous blog post, I showed how to use the Extract Segment feature in Recode in JMP 15 to extract a number embedded in a string. Recode also makes it easy to use a regular expression to perform the same task. Depending on the form of the data, it may be easier to use one tool versus the other.
Step 1. We’ll start off the same way, by invoking Recode on the Observation ID column.
Step 2. From the red triangle menu, choose Replace String…
Step 3. We are best off sticking to simple regular expressions. In this case, we are searching for the letters “WK” followed by one or more digits, so the pattern is WK\d+. \d matches a digit, and the plus sign means “1 or more”.
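To see what this pattern matches outside of JMP, here is a minimal Python sketch. The sample values are hypothetical stand-ins for the Observation ID column, and `find_week_token` is a helper name made up for illustration; Python's `re` module is used only because it treats `\d` and `+` the same way for this simple pattern.

```python
import re

# "WK" followed by one or more digits, as described in Step 3.
WEEK_PATTERN = re.compile(r"WK\d+")

def find_week_token(value):
    """Return the first 'WK<digits>' token in value, or None if there is none."""
    match = WEEK_PATTERN.search(value)
    return match.group(0) if match else None

# Hypothetical sample values; the real column may look different.
observation_ids = ["A-WK1-01", "B-WK12-07", "C-BASELINE"]
tokens = [find_week_token(v) for v in observation_ids]
print(tokens)  # ['WK1', 'WK12', None]
```

Note that `\d+` picks up every consecutive digit, so "WK12" is matched whole rather than as "WK1".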
Step 4. Looking at our replacement values, we can tell that the regular expression found WK1, but replaced it with nothing. We don’t want the entire match, only the numeric portion, so we add parentheses around \d+ to create a capture group. We can then use \1 as the replacement text to replace the match with the contents of the first capture group.
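The effect of the capture group can be sketched in Python, whose `re.sub` also accepts `\1` in the replacement text. The sample value is hypothetical, and this illustrates the regular expression behavior only, not Recode itself.

```python
import re

sample = "A-WK1-01"  # hypothetical Observation ID value

# Empty replacement: the matched "WK1" simply disappears.
without_group = re.sub(r"WK\d+", "", sample)

# Parentheses capture the digits; \1 puts them back in place of the match.
with_group = re.sub(r"WK(\d+)", r"\1", sample)

print(without_group)  # A--01
print(with_group)     # A-1-01
```

Only the matched portion changes; the text on either side of the match is left in place.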
Step 5. Now we have the number (e.g., 1) in the place where the matched string was before (e.g., “WK1”), but the rest of the original value is still there. All we want is the number. There’s a new checkbox in Recode that replaces the entire string with the replacement value.
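A rough Python analogue of replacing the entire string is to search for the pattern and, when it is found, return only the expanded replacement. The function name and sample values are hypothetical, and leaving non-matching values unchanged is an assumption of this sketch rather than a statement about how Recode handles them.

```python
import re

def replace_whole_string(value, pattern, replacement):
    """Return only the expanded replacement when pattern matches anywhere in value.

    Assumption of this sketch: values with no match are returned unchanged.
    """
    match = re.search(pattern, value)
    return match.expand(replacement) if match else value

# Hypothetical sample values.
for observation_id in ["A-WK1-01", "B-WK12-07", "C-BASELINE"]:
    print(replace_whole_string(observation_id, r"WK(\d+)", r"\1"))
# prints 1, then 12, then C-BASELINE
```

The results are still strings at this point, so a separate conversion is needed before they can be treated as numbers.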
Step 6. Now we have our data trimmed down to just the WK number as a string. We can continue by parsing the string into a number as before. If we save the result as a column formula, we’ll get the following formula.
The resulting column looks like this:
Regular expressions are a valuable tool to have in your toolbox. With JMP 15, you can combine regular expressions with other tools in Recode to perform more advanced data preparation.