gail_massari
Community Manager

Can I remove terms from the Document Term Matrix after it is run during text analysis?

Some background: Text Explorer analysis options are based on the document term matrix (DTM). A term, or token, is the smallest piece of text, similar to a word in a sentence. A document is the collection of words in a cell. Each row in the DTM corresponds to a document (a cell in a text column of a JMP data table). Each column in the DTM corresponds to a term from the curated term list. The analysis ignores word order. In its simplest form, each cell of the DTM contains the frequency (number of occurrences) of the column's term in the row's document.
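To make the structure concrete, here is a minimal Python sketch of a frequency-based DTM. It is only an illustration of the idea, not how Text Explorer is implemented: the tokenizer (lowercase, split on non-word characters), the function names, and the sample documents are all assumptions made for this example.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9']+", text.lower())

def build_dtm(documents):
    """Return (terms, matrix) where matrix[i][j] is the count of terms[j] in documents[i]."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    # The term list is every distinct token, sorted so the column order is stable.
    terms = sorted(set().union(*counts))
    matrix = [[c[t] for t in terms] for c in counts]
    return terms, matrix

# Hypothetical documents: each string stands for one cell of a text column.
documents = [
    "The pump failed.",
    "Pump seal leak; seal replaced.",
    "The motor failed",
]

terms, dtm = build_dtm(documents)
for doc, row in zip(documents, dtm):
    print(row, doc)
```

Each printed row is one document, and each position in the row is one term. Word order is discarded; only the counts remain.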

 

To remove terms after analysis: After running Text Explorer (Analyze > Text Explorer, then select the text column(s) to analyze), you can remove terms from the Term and Phrase List. Select the term(s) in the Terms list, right-click, and choose Add Stop Word.
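Conceptually, marking a term as a stop word drops its column from the DTM. The Python sketch below shows that effect on a small hand-written matrix. The function name `drop_terms` and the sample data are hypothetical, and this is not a description of what JMP does internally.

```python
def drop_terms(terms, matrix, stop_words):
    """Return a copy of (terms, matrix) without the columns for the given stop words."""
    stop = {w.lower() for w in stop_words}
    # Indices of the columns that survive the stop-word filter.
    keep = [j for j, term in enumerate(terms) if term.lower() not in stop]
    new_terms = [terms[j] for j in keep]
    new_matrix = [[row[j] for j in keep] for row in matrix]
    return new_terms, new_matrix

# Hypothetical DTM: two documents, four terms.
terms = ["failed", "pump", "seal", "the"]
dtm = [
    [1, 1, 0, 1],
    [0, 1, 2, 0],
]

new_terms, new_dtm = drop_terms(terms, dtm, ["the"])
print(new_terms)
print(new_dtm)
```

The rows (documents) are unchanged; only the column for the removed term disappears.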

 

You can then save the updated document term matrix to the data table. This adds one column per term to the data table, which is useful for subsequent analysis.
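As a rough picture of what saving the DTM produces, the Python sketch below appends one count column per term to each row of a small table. The table layout (a list of dictionaries), the function name `save_dtm_to_table`, and the sample values are assumptions for illustration only; the column names JMP actually generates may differ.

```python
def save_dtm_to_table(rows, terms, matrix, prefix=""):
    """Return new rows with one added column per term holding that document's count."""
    if len(rows) != len(matrix):
        raise ValueError("the table and the DTM must have the same number of rows")
    out = []
    for row, counts in zip(rows, matrix):
        new_row = dict(row)  # copy so the original table is left untouched
        for term, count in zip(terms, counts):
            new_row[prefix + term] = count
        out.append(new_row)
    return out

# Hypothetical data table with one text column, plus its DTM.
table = [
    {"Comment": "The pump failed."},
    {"Comment": "Pump seal leak; seal replaced."},
]
terms = ["failed", "pump", "seal"]
dtm = [
    [1, 1, 0],
    [0, 1, 2],
]

wide = save_dtm_to_table(table, terms, dtm)
for row in wide:
    print(row)
```

Saving once before editing the term list and again afterward gives two sets of columns that can be compared in later analyses.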

 

Terms are in the left column of the Term and Phrase List.

Save the DTM to the data table before and/or after editing terms.
 