Craige_Hales

Search a string for all occurrences of a pattern

Problem

Find the positions of all occurrences of a pattern within a string. (Thanks @DonMcCormack )

Solution

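Use the pattern matching functions. In the script below, Pat Repeat loops through the source string, Pat Pos captures the position of each match, and Pat Test runs a bit of JSL to record the results.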

source = "this is a racecar and ere noon stressed desserts for Stanley Yelnats saw Anna was there";

palindrome = (Pat Regex( "[a-zA-Z][a-zA-Z ]{0,999}" )) >> firstpart + // one or more first letters
(Pat Regex( "[a-zA-Z ]{0,1}" )) + // optional middle letter
Expr( Reverse( firstpart ) ); // the tail

list = {}; // places to
positions = {}; // keep results
Pat Match(
    source,
    Pat Repeat( // repeat *is* a loop, within the source string
        (Pat Pos() >> location + // capture current position in string
            palindrome >> another + // capture the text of the palindrome match
            Pat Test( // this function runs some JSL to see if the current position matches
                Insert Into( list, another );// the JSL saves
                Insert Into( positions, location );// two answers
                1 // pattest returns 1(OK) because we are not actually testing anything...just wanted to run some JSL
            )//
        ) 
        | // above adds to list, below skips forward a letter to try again 
        Pat Len( 1 ) // this bit is more important than it looks; it matches all the text that isn't interesting
    )
);
Show( list, positions );
list = {"a racecar a", "ere", "noon", "stressed desserts", "Stanley Yelnats", "saw Anna was", "ere"};
positions = {8, 22, 26, 31, 53, 69, 84};

Discussion

Pat Repeat is a pattern matching function that repeats its pattern as many times as it can. On its own it cannot skip over the uninteresting text; the Pat Len( 1 ) alternative lets it consume one character at a time until the next occurrence of the desired pattern is found.
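To see why the Pat Len( 1 ) alternative matters, here is a minimal sketch of the same loop with that alternative removed. The short source string and the literal "is" pattern are chosen only for illustration. Without a way to step past non-matching characters, the repeat should give up at the first character that is not part of a match, so it would not be expected to record every occurrence.

```jsl
source = "this is it"; // illustrative string, not from the original example

list = {};
positions = {};
Pat Match(
    source,
    Pat Repeat(
        (Pat Pos() >> location + // capture current position in string
            "is" >> another + // capture the text of the match
            Pat Test(
                Insert Into( list, another );
                Insert Into( positions, location );
                1 // always succeed; the test is only here to run the JSL above
            )
        )
        // no "| Pat Len( 1 )" here, so the repeat cannot skip the space after "this"
    )
);
Show( list, positions ); // compare with the result when "| Pat Len( 1 )" is restored
```

Putting `| Pat Len( 1 )` back after the parenthesized group restores the behavior shown in the solution.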

You can use a really simple pattern too:

source = "this is a racecar and ere noon stressed desserts for Stanley Yelnats saw Anna was there";

reallysimple = "is"; 

list = {}; // places to
positions = {}; // keep results
Pat Match(
    source,
    Pat Repeat( // repeat *is* a loop, within the source string
        (Pat Pos() >> location + // capture current position in string
            reallysimple >> another + // capture the text of the match
            Pat Test( // this function runs some JSL to see if the current position matches
                Insert Into( list, another );// the JSL saves
                Insert Into( positions, location );// two answers
                1 // pattest returns 1(OK) because we are not actually testing anything...just wanted to run some JSL
            )//
        ) 
        | // above adds to list, below skips forward a letter to try again 
        Pat Len( 1 ) // this bit is more important than it looks; it matches all the text that isn't interesting
    )
);
Show( list, positions );

list = {"is", "is"};
positions = {2, 5};

 
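Note that Pat Pos() is used without an argument to capture the matcher's cursor position, which counts from zero: 0 is before the first character, 1 is just after the first character, and so on. That is why the positions reported above are zero-based offsets. With an argument, Pat Pos( N ) is instead a test that succeeds only if the cursor is at N.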
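The test form of Pat Pos can be tried on its own. This is a small sketch using a made-up string. It assumes the default (unanchored) matching mode, in which Pat Match tries each starting position in turn.

```jsl
// cursor position 2 sits just before the "c", so Pat Pos( 2 ) + "cd" should match
rc1 = Pat Match( "abcdef", Pat Pos( 2 ) + "cd" );

// cursor position 3 sits just before the "d", so "cd" cannot follow and the match should fail
rc2 = Pat Match( "abcdef", Pat Pos( 3 ) + "cd" );

Show( rc1, rc2 ); // Pat Match returns 1 for success and 0 for failure
```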

See Also

https://community.jmp.com/t5/Uncharted/Pattern-Matching/ba-p/21005

https://www.jmp.com/support/help/14/pattern-matching.shtml