Nice! A few changes, explained below.
input = "CATE_N_C1_Shotp,CATE_P_C1_Shotp,CATE_P_C1_Shotp"; // input string
// Define boundary characters as _ and ,
boundaryChars = "_,";
// Associative array to count word frequency
wordsToCount = [=> ];
// Match pattern in the input
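rc = Pat Match(
    Lowercase( input ) || boundaryChars, // pre-normalize and add sentinel at end
    Pat Pos( 0 ) // anchor the match at the start of the string
    + Pat Repeat(
        // >? (Pat Immediate) assigns word right away, so Pat Test sees the current token;
        // >> (Pat Conditional) would not assign until the whole match has finished
        Pat Break( boundaryChars ) >? word + Pat Span( boundaryChars )
        + Pat Test(
            If( Contains( wordsToCount, word ),
                wordsToCount[word] = wordsToCount[word] + 1,
                wordsToCount[word] = 1
            );
            1; // explicitly, the result of Pat Test is 'true'
        )
    )
    + Pat R Pos( 0 ) // the pattern must reach the end
);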
// Identify common words (those that appear in all elements)
elements = Words( input, "," );
totalElements = N Items( elements );
commonWords = [=> ];
For Each( {{word, count}}, wordsToCount, If( count == totalElements, commonWords[word] = count ) );
// Display the common words and their counts
Show( commonWords ); // commonWords = ["c1" => 3, "cate" => 3, "shotp" => 3];
The main change is using Pat Repeat to walk through the input one token (word) at a time, and adding a sentinel separator at the end. Your original code walks the string by trying to match only one word, then discovering the word is not at the end of the string, advancing the start of the match by one character, and trying to match again. It misses the final "shotp" because there is no final separator: Pat Break only succeeds when it actually finds one of the boundary characters. The work in Pat Test is simplified by pre-lowercasing the input at the same time the sentinel is added.
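To see the effect of the sentinel in isolation, here is a minimal sketch (untested here) that applies the same token pattern, without the counting, to a short made-up string. The variable name tokenPattern and the test string are only for illustration. As I understand Pat Break, the first match should fail and the second should succeed.

```jsl
boundaryChars = "_,";
// One or more "word followed by separators" tokens that together span the whole string
tokenPattern = Pat Pos( 0 )
    + Pat Repeat( Pat Break( boundaryChars ) + Pat Span( boundaryChars ) )
    + Pat R Pos( 0 );
// No trailing separator: Pat Break cannot find a boundary character after the last word
Show( Pat Match( "cate_shotp", tokenPattern ) );
// Sentinel appended: the last word is now followed by separators, like every other word
Show( Pat Match( "cate_shotp" || boundaryChars, tokenPattern ) );
```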
Pretty sure someone will propose a solution using the Words() function, which won't need a sentinel.
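For comparison, a rough sketch of what that Words() version might look like (untested here, and assuming JMP 16 or later for For Each). The variable names tokens and token are only for illustration; the counting and filtering steps are the same as in the script above.

```jsl
input = "CATE_N_C1_Shotp,CATE_P_C1_Shotp,CATE_P_C1_Shotp";
// Words() splits on any of the delimiter characters, so no sentinel is needed
tokens = Words( Lowercase( input ), "_," );
wordsToCount = [=> ];
For Each( {token}, tokens,
    If( Contains( wordsToCount, token ),
        wordsToCount[token] = wordsToCount[token] + 1,
        wordsToCount[token] = 1
    )
);
// Keep the words whose count equals the number of comma-separated elements
totalElements = N Items( Words( input, "," ) );
commonWords = [=> ];
For Each( {{word, count}}, wordsToCount,
    If( count == totalElements, commonWords[word] = count )
);
Show( commonWords );
```

Like the pattern version, this treats "count equals the number of elements" as a stand-in for "appears in every element", so a word repeated inside one element would be over-counted.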
edit: Somehow I missed this sentinel from the past.
Craige