Count number of occurrences of specific words in a string
Hello, I'm looking to create a new column in my dataset that counts the number of times a specific word appears in a string. The words in each row are separated by "*".
Example table:
Row  Animal List
1    dog*dog*cat*bird*dog
2    dog*dog*cat*bird*dog*dog
Output should be:
Row  Animal List                Dog Count
1    dog*dog*cat*bird*dog       3
2    dog*dog*cat*bird*dog*dog   4
I can provide additional information if needed.
Accepted Solutions
Re: Count number of occurrences of specific words in a string
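Here is a formula that counts the number of times "dog" appears in each row of a given column. Words() splits the string on "*", Loc() returns the positions of the items equal to "dog", and N Rows() counts those positions:
N Rows( Loc( Words( :Column 1, "*" ), "dog" ) )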
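To try the formula outside a data table, here is a minimal sketch that applies it to the two sample strings from the question. The variable names row1 and row2 are only for illustration; in a column formula you would reference the column itself.

```jsl
Names Default To Here( 1 );

// Sample strings copied from the question (row1/row2 are illustrative names)
row1 = "dog*dog*cat*bird*dog";
row2 = "dog*dog*cat*bird*dog*dog";

// Words() splits on "*", Loc() finds the positions of exact "dog" items,
// and N Rows() counts how many positions were found
Show( N Rows( Loc( Words( row1, "*" ), "dog" ) ) );
Show( N Rows( Loc( Words( row2, "*" ), "dog" ) ) );
```

The two results should be 3 and 4, matching the Dog Count values requested in the question.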
Re: Count number of occurrences of specific words in a string
How would this work for wildcard matches? For example, what if one entry were dog1 and another dog2? I would still want to count all of the dog words.
Re: Count number of occurrences of specific words in a string
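The following formula counts every item that contains the string "dog", so dog1 and dog2 are included. Here st is the string to search (for example, :Column 1 in a column formula). Contains() returns the position of the match, or 0 if there is none, so the result is tested against 0 before being added to the count:
wordList = Words( st, "*" );
count = 0;
For( i = 1, i <= N Items( wordList ), i++,
	count += (Contains( wordList[i], "dog" ) > 0)
);
count;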
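As a self-contained illustration of the substring count, the sketch below wraps the same loop in a function and runs it on a made-up string. The function name countContaining and the sample string are assumptions for demonstration, not part of the original answer. Because Contains() returns the position of the match rather than 1, the sketch increments the count only when that position is greater than 0.

```jsl
Names Default To Here( 1 );

// countContaining is a hypothetical helper name used only for this example
countContaining = Function( {st, target}, {wordList, count, i},
	wordList = Words( st, "*" );   // split on the "*" delimiter
	count = 0;
	For( i = 1, i <= N Items( wordList ), i++,
		// Contains() gives the match position, or 0 when target is absent
		If( Contains( wordList[i], target ) > 0, count++ )
	);
	count;
);

// Made-up sample: dog1, dog2 and dog all contain "dog"
Show( countContaining( "dog1*dog2*cat*bird*dog", "dog" ) );
```

For this sample the result should be 3, since dog1, dog2, and dog each contain "dog" while cat and bird do not.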
Re: Count number of occurrences of specific words in a string
Does the function Loc() exist only in JMP Pro?
Is there a way to count words or specific symbols in a string in regular JMP?
@txnelson wrote: Here is the formula for counting the number of times "dog" appears in a given column
N Rows( Loc( Words( :Column 1, "*" ), "dog" ) )
Re: Count number of occurrences of specific words in a string
The JMP 13 documentation does not indicate that Loc() is available only in JMP Pro. I would verify that by running the Loc() example from the Scripting Index (Help ==> Scripting Index):
Names Default To Here( 1 );
Show( Loc( [1 0 1 0 1 0] ) );
Show( Loc( {"A", 2, 3, 2, 5, 2, 4, [1 5]}, 2 ) );
Show(
Loc( {"A", 2, 3, 2, 5, 2, 4, [1 5]}, [1 5] )
);
Re: Count number of occurrences of specific words in a string
I can confirm that the Loc() function is not specific to JMP Pro. It is available in JMP.
Re: Count number of occurrences of specific words in a string
Hi
What if there is no delimiter? For example, I'd like to count how many "1" characters appear in a binary string such as 111001010.
How can I get the results like this?
Thanks
Re: Count number of occurrences of specific words in a string
Many ways to do this. Here's one.
ASCIIcode = blobtomatrix(chartoblob("1"),"int",1,"big")[1]; // 49
intermediate=blobtomatrix(chartoblob("0101111001010"),"int",1,"big")==ASCIIcode;
// intermediate=[0, 1, 0, 1, 1, 1, 1, 0, 0, 1, 0, 1, 0]
sum(intermediate);
Line 1 is just a way to get the ASCII code for the ASCII character "1", returned in a matrix of 1 element.
Line 2 is similar, but gets a bigger array and compares it to the desired code, resulting in the value in the Line 3 comment.
Line 4 just adds up the elements.
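Another way to do the same count, sketched here as an alternative rather than as the method above, is to remove every occurrence of the target with Substitute() and compare string lengths. The variable names s, target, and nOnes are illustrative.

```jsl
Names Default To Here( 1 );

s = "111001010";   // binary string from the question
target = "1";      // character (or substring) to count

// Deleting every occurrence shortens the string by Length( target ) per match,
// so the length difference divided by Length( target ) is the match count
nOnes = (Length( s ) - Length( Substitute( s, target, "" ) )) / Length( target );
Show( nOnes );
```

For the string in the question this should give 5. The same expression also works for longer targets, although it counts non-overlapping occurrences only.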