Showing results for 
Show  only  | Search instead for 
Did you mean: 
JMP is taking Discovery online, April 16 and 18. Register today and join us for interactive sessions featuring popular presentation topics, networking, and discussions with the experts.
Submit your abstract to the call for content for Discovery Summit Americas by April 23. Selected abstracts will be presented at Discovery Summit, Oct. 21- 24.
Super User
Backtracking Secrets


The aphorism "If at first you don't succeed, try, try again" is formalized in pattern matching in a process called backtracking. Backtracking is how the matcher checks out all the possible combinations that might match. This post uses some secret features in JMP's pattern matcher to expose the backtracking process. The single statement below is a moderately complicated pattern match.  But even simple matches (and regex) will use backtracking.

Secret 1: the Log() function on line 21 produces output in the JMP log window. The >> before it is taking the previously matched text and printing it, along with a zero-based position in the source and the text label supplied. This has nothing to do with backtracking and everything to do with debugging. You can use this secret to debug your own patterns.

Secret 2: PatTest() has three arguments, even though only one is ever used. This is all about backtracking. The JSL in the second argument is executed in the forward-tracking direction and the JSL in the third argument is executed in the backtracking direction.

Secret 3: PatMatch() normally optimizes failure, as best it can. When it sees there won't be enough text remaining in the source string to match the remaining minimum length of the pattern, it stops. The FULLSCAN option turns off that optimization. For this example, the backtracking will seem to stop prematurely without the FULLSCAN option.

Secret 4: PatFail() can be used to find all possible matches by forcing each match to fail and alternatives to be considered.

Here's the code and output with an explanation below.

Pat Match(
  "EDACGDE", // a contrived string to exercise backtracking
    Pat Len( 1 ) >> p // match a character, store in p
    Pat Test(
      p > "B", // test
      Write( "p=", p, " " ), // advance
      Write( "retry\!n" ) // backtrack
    Pat Arb() // skip zero or more characters
    Pat Len( 1 ) >> q // match another, store in q
    Pat Test(
      p < q & q < "F", // test
      Write( "q=", q, " " ), // advance
      Write( "retry " ) // backtrack
  ) >> Log( "FOUND" ) // a debugging tool
  Pat Fail(), // try alternatives
  NULL, // no replacement
  FULLSCAN // to the bitter end

p=E q=D retry q=A retry q=C retry q=G retry q=D retry q=E retry retry

p=D q=A retry q=C retry q=G retry q=D retry q=E


retry retry

p=A retry

p=C q=G retry q=D


retry q=E


retry retry

p=G q=D retry q=E retry retry

p=D q=E


retry retry

p=E retry



First, the final zero. That's the return code, and it says the match failed. The match actually FOUND four matches, but the PatFail() caused it to look for another. This is expected.

Next, the Ps and Qs. p starts at E and the test for >B is happy. PatArb() is reluctant and initially skips nothing. q gets the D following E. But the test is unhappy this time; q between p and "F" is not true. Time to backtrack! The LEN(1) that put the D into q gives the D back and the PatArb() is given a chance to do something different. This time it reluctantly takes one character, the D. The PatLen(1) then gives the A to q. At the end of the first line, there are two retries; the second is for p. The next line shows p getting the second character, D. Since the pattern match is not anchored, the beginning of the match is allowed to float forward through the text as part of the backtracking process; this time the match goes forward with the PatArb taking more and more until the first success happens. The >>Log("FOUND") records the success in the log, then the PatFail() causes the backtracking to continue.

If it isn't clear, you can try simpler tests and instrument them with the >>LOG("tag") method to get a feel for what's happening.

At this point it should be clear why this match runs for a very, very long time:


while you can see it can't possibly succeed because the string of has no b anywhere, the matcher has to discover that by trying lots of silly combinations: skip 2 characters followed by skip 3... and skip 3 followed by skip 2... etc, etc, etc.... Backtracking can't fix a badly written pattern. That progress bar does not mean it is almost finished. (There is no reason to put the patarb inside the patrepeat, except to see this dialog. Patarb already matches an arbitrary number of characters.)

update 1Feb2017 - repair formatting for new community.

Last Modified: Feb 1, 2017 8:04 AM