cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
mad
mad
Level III

Find Replace Script

I have a column that contains string. I want to search through the column for a substring and delete it. I could do this through the Find/Replace dialog, but I want to do it in a script. I tried: Munger(:STATE, 1, "RX_", ""); What I expected it to do: change every occurrence of "RX_xxxxx" to "xxxxx", but it does nothing. What am I doing wrong?
1 ACCEPTED SOLUTION

Accepted Solutions
pmroz
Super User

Re: Find Replace Script

The SUBSTITUTE function, when used with a string, will replace all occurrences of a substring within a string. I didn't realize that when used with a list it only works with exact matches of the string-to-replace.

z = "aaabbbccc";
b = substitute(z, "a", "x");

Results in:

"xxxbbbccc"

So, you could use the for each row method or loop over all rows and use the SUBSTITUTE function.

View solution in original post

9 REPLIES 9
pmroz
Super User

Re: Find Replace Script

I find the SUBSTITUTE function to be a lot easier than MUNGER.

There may be a "one-liner" way, but here's what I do:

ms
Super User (Alumni) ms
Super User (Alumni)

Re: Find Replace Script

Try to combine munger() with the foreachrow() function to loop through all the rows.
mad
mad
Level III

Re: Find Replace Script

Thanks to PMroz and MS for your help. I've gone with MS's suggestion for now, but I'm wondering what the impact on runtime will be if I have a couple million rows to loop through.
ms
Super User (Alumni) ms
Super User (Alumni)

Re: Find Replace Script

I tried with 1 million rows and both method are comparably fast, completed in 3-4 seconds on my four years old Mac Book Pro. Maybe the list method won by a small margin.

However they are not directly comparable. The substitute do not find substrings, only exactly matched strings. Munger finds the substrings but only the first instance.
mad
mad
Level III

Re: Find Replace Script

Thanks. Is there a more generic function that will find/replace all occurrences of a substring within a string?
pmroz
Super User

Re: Find Replace Script

The SUBSTITUTE function, when used with a string, will replace all occurrences of a substring within a string. I didn't realize that when used with a list it only works with exact matches of the string-to-replace.

z = "aaabbbccc";
b = substitute(z, "a", "x");

Results in:

"xxxbbbccc"

So, you could use the for each row method or loop over all rows and use the SUBSTITUTE function.
mad
mad
Level III

Re: Find Replace Script

PMroz, thanks.

MS, is there a function that can be used to find the last occurrence (i.e. search right to left instead of left to right), or do I have to iterate through the string to find the last occurrence and modify the offset accordingly?
mpb
mpb
Level VII

Re: Find Replace Script

There is a "reverse" function (ex: reverse("abc") --> "cba") which might help. You would search for the reversed text occurring first in the reversed column value, replace it with the reversed replacement text and then reverse the resulting string. Don't know how that compares speedwise with iterating the forward search.

Re: Find Replace Script

You could try using pattern matching. Its abit obscure but very powerful.

Here is a simple example:

str = "RX_ABC";
prefix = "RX_";
strState = "";
pattern = PatString(prefix) + PatRem() >> strState;
p = PatMatch(str,pattern);
If (p,
show(strState)
)

This will convert RX_ABC to ABC with the result going into strState.

So for an entire column:

prefix = "RX_";
For Each Row(
pattern = PatString(prefix) + PatRem() >> strState;
p = PatMatch(Column("State")[],pattern);
If (p,
Column("State")[] = strState
)
)