Subscribe Bookmark RSS Feed

Find Replace Script

mad

Community Trekker

Joined:

Jun 23, 2011

I have a column that contains string. I want to search through the column for a substring and delete it. I could do this through the Find/Replace dialog, but I want to do it in a script. I tried: Munger(:STATE, 1, "RX_", ""); What I expected it to do: change every occurrence of "RX_xxxxx" to "xxxxx", but it does nothing. What am I doing wrong?
9 REPLIES
pmroz

Super User

Joined:

Jun 23, 2011

I find the SUBSTITUTE function to be a lot easier than MUNGER.

There may be a "one-liner" way, but here's what I do:

ms

Super User

Joined:

Jun 23, 2011

Try to combine munger() with the foreachrow() function to loop through all the rows.
mad

Community Trekker

Joined:

Jun 23, 2011

Thanks to PMroz and MS for your help. I've gone with MS's suggestion for now, but I'm wondering what the impact on runtime will be if I have a couple million rows to loop through.
ms

Super User

Joined:

Jun 23, 2011

I tried with 1 million rows and both method are comparably fast, completed in 3-4 seconds on my four years old Mac Book Pro. Maybe the list method won by a small margin.

However they are not directly comparable. The substitute do not find substrings, only exactly matched strings. Munger finds the substrings but only the first instance.
mad

Community Trekker

Joined:

Jun 23, 2011

Thanks. Is there a more generic function that will find/replace all occurrences of a substring within a string?
pmroz

Super User

Joined:

Jun 23, 2011

The SUBSTITUTE function, when used with a string, will replace all occurrences of a substring within a string. I didn't realize that when used with a list it only works with exact matches of the string-to-replace.

z = "aaabbbccc";
b = substitute(z, "a", "x");

Results in:

"xxxbbbccc"

So, you could use the for each row method or loop over all rows and use the SUBSTITUTE function.
mad

Community Trekker

Joined:

Jun 23, 2011

PMroz, thanks.

MS, is there a function that can be used to find the last occurrence (i.e. search right to left instead of left to right), or do I have to iterate through the string to find the last occurrence and modify the offset accordingly?
mpb

Super User

Joined:

Jun 23, 2011

There is a "reverse" function (ex: reverse("abc") --> "cba") which might help. You would search for the reversed text occurring first in the reversed column value, replace it with the reversed replacement text and then reverse the resulting string. Don't know how that compares speedwise with iterating the forward search.
You could try using pattern matching. Its abit obscure but very powerful.

Here is a simple example:

str = "RX_ABC";
prefix = "RX_";
strState = "";
pattern = PatString(prefix) + PatRem() >> strState;
p = PatMatch(str,pattern);
If (p,
show(strState)
)

This will convert RX_ABC to ABC with the result going into strState.

So for an entire column:

prefix = "RX_";
For Each Row(
pattern = PatString(prefix) + PatRem() >> strState;
p = PatMatch(Column("State")[],pattern);
If (p,
Column("State")[] = strState
)
)