
Text Explorer: Using Manage Stem Exceptions option

kritterb

Occasional Contributor

Joined:

Jul 26, 2017

In examining my term clusters in JMP Pro, I discovered that some of my stems made zero sense beyond pattern recognition.

 

For example, ration is considered a stem for the following and more:

  • acceleration
  • operation
  • administration
  • intercalibration
  • exploration

These words all contain "ration" in their spelling, but are otherwise completely different words that should not be lumped together.

 

I found my way to the Manage Stem Exceptions window, realized there was no clarification on how to enter terms/stems, looked up the documentation and found nothing specific, and am now scratching my head.


You see, I want ration to stay a stem for ration and rations, and I want to separately track the terms it's considered a stem for--possibly as stems themselves for their plural forms. The only thing I want to remove is the connection. Ideally, the stem exceptions would be some kind of stem-to-term format, but I see no examples on how to do this, if it's even possible in JMP Pro.

 

So does anyone know?

1 ACCEPTED SOLUTION

Accepted Solutions
Craige_Hales

Staff

Joined:

Mar 21, 2013

Solution

I think the Show Text button in the SVD Plots is at the heart of the problem; it only applies to the left-hand documents graph, not the right-hand terms graph. It looks like the right way to show documents for the selected terms is to go back to the term list and, without changing the selection, right-click a term in the term column and pick Show Text from there. I'm talking with another developer here about how to make the UI work better.

Show Text only works for left-hand graph

Craige
4 REPLIES
Craige_Hales

Staff

Joined:

Mar 21, 2013

The stem word, oper, is the same for operation, operator, operating, and operated:

stem for combining

The trailing middle-dot on the terms tells you a suffix was removed; administration did not have a suffix removed because I chose stem for combining, not stem all terms, and there were no other administr... words to combine with administration.
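To make the "stem for combining" behavior described above concrete, here is a minimal Python sketch. It is not JMP's stemmer: the suffix list and the names `toy_stem` and `stem_for_combining` are hypothetical, chosen only to reproduce the oper/administration example. The idea it illustrates is that a suffix is stripped (and the middle dot appended) only when at least two different terms share the resulting stem.

```python
from collections import defaultdict

# Hypothetical toy suffix list; a real stemmer uses far richer rules.
SUFFIXES = ("ation", "ator", "ating", "ated")


def toy_stem(word):
    """Strip the first matching suffix, or return the word unchanged."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def stem_for_combining(terms):
    """Map each term to 'stem·' only if two or more terms share that stem."""
    groups = defaultdict(set)
    for term in terms:
        groups[toy_stem(term)].add(term)

    result = {}
    for term in terms:
        stem = toy_stem(term)
        if len(groups[stem]) > 1 and stem != term:
            result[term] = stem + "·"  # middle dot marks a removed suffix
        else:
            result[term] = term  # nothing to combine with, keep as-is
    return result


terms = ["operation", "operator", "operating", "operated", "administration"]
combined = stem_for_combining(terms)
for term, shown in combined.items():
    print(term, "->", shown)
```

In this sketch the four oper... words collapse to one stem, while administration is left whole because no other term in the list shares its stem.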

Craige
kritterb

Occasional Contributor

Joined:

Jul 26, 2017

I'm looking at unstructured text, not a list of hand-selected terms. I also did not ask for an explanation of how stems work. I'm well-aware and have used them in other software packages, including the Text Miner within SAS Enterprise Miner, which does have options for "editing synonyms".

 

I'm asking for guidance on how to edit the connections in the automatically generated stem list within JMP Pro.

kritterb

Occasional Contributor

Joined:

Jul 26, 2017

When I noticed the stem variations for ration, it was through the "Show Text" option on a term cluster. 

 

I double-checked my Stemming list to see what came up under ration- and did not see the same words I found for ration in "Show Text".

 

So JMP appears to be inconsistent between stem definitions and identifying stems in the text.

 

Since the term cluster included documents with the unexpected variations of ration, my guess is that the term search within the unstructured text is a plain word-find that ignores word boundaries. That would explain why my stem list looks great while the Show Text portion is full of mix-ups.
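The kind of mix-up I suspect can be sketched in a few lines of Python. This is only an illustration of the hypothesis, not JMP's actual search code; the word list and variable names are made up for the example. A substring search finds ration inside unrelated words, while a search anchored on word boundaries does not.

```python
import re

words = ["ration", "rations", "acceleration", "operation", "administration"]

# Naive substring search: matches any word that merely contains "ration".
substring_hits = [w for w in words if "ration" in w]

# Boundary-aware search: "ration" must be a whole word, optionally plural.
token_pattern = re.compile(r"\brations?\b")
token_hits = [w for w in words if token_pattern.search(w)]

print("substring:", substring_hits)
print("whole word:", token_hits)
```

The substring version returns all five words; the boundary-aware version returns only ration and rations, which is the grouping I would expect.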

 

If I'm correct, please fix that. The clustering tool is not helpful if it's grouping completely unrelated terms. The ration example was just one of dozens I found.
