How to calculate TF-IDF
Dear all,
I am new to using this software and am struggling to figure out how to use it to calculate the TF-IDF values for documents in my data set.
I am sure it is included in the functionality of JMP but it must be under a different name. Would anyone be able to point me in the right direction?
Thanks
Mark
Re: How to calculate TF-IDF
You may want to have a look at Text Explorer (main menu: Analyze).
Please see the following example script; it can also be found in the Scripting Index or the Statistics Index (main menu: Help).
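Names Default To Here( 1 );
// Open one of the sample data tables shipped with JMP
dt = Open( "$SAMPLE_DATA/Consumer Preferences.jmp" );
// Launch Text Explorer on the free-text column
obj = dt << Text Explorer( TextColumns( :Reasons Not to Floss ) );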
Re: How to calculate TF-IDF
In practice, I have a set of x documents, and I would like to calculate the TF-IDF for 2 of them.
When I go to Text Explorer, can I get JMP to calculate the TF-IDF using menu options, and how do I specify which documents it should be calculated for?
Re: How to calculate TF-IDF
My approach would be to first import the text into a data table (JMP can even import whole folders via Import Multiple Files …),
and then use Text Explorer on that data table.
So this depends on the type and structure of your files.
Re: How to calculate TF-IDF
It may also be worth checking out the following add-in: Text Importer - Text, PDF, Word Documents, and Powerpoint
Re: How to calculate TF-IDF
TF-IDF is one of the built-in transforms that may be applied to the document-term matrix when you save it back to the data table from the Text Explorer platform.
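To make concrete what such a weighting does to a document-term matrix, here is a minimal sketch in Python, outside JMP. It assumes the textbook form tf * log10(N / df), which may differ from the exact formula JMP applies (log base, smoothing), so check the Text Explorer documentation before comparing numbers. The function name `tf_idf_matrix` and the three toy sentences are made up for illustration.

```python
import math
from collections import Counter


def tf_idf_matrix(docs):
    """Return (terms, rows): one row of TF-IDF weights per document.

    Assumed weighting: tf * log10(N / df), where tf is the raw count of the
    term in the document, N is the number of documents, and df is the number
    of documents containing the term. JMP's exact formula may differ.
    """
    # Naive tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    terms = sorted({t for tokens in tokenized for t in tokens})
    n_docs = len(docs)
    # Document frequency: count each term at most once per document.
    df = Counter(t for tokens in tokenized for t in set(tokens))
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append([counts[t] * math.log10(n_docs / df[t]) for t in terms])
    return terms, rows


# Hypothetical toy corpus, one string per document.
docs = [
    "floss every day",
    "no time to floss",
    "gums bleed every time",
]
terms, rows = tf_idf_matrix(docs)

# Show the nonzero weights for the first two documents only.
for i in (0, 1):
    print({t: round(w, 3) for t, w in zip(terms, rows[i]) if w > 0})
```

Note that the IDF part depends on the whole collection, so weights for just a couple of documents are still computed against all documents; you then read off the rows you care about.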
Re: How to calculate TF-IDF
Have you looked at the JMP documentation, Text Explorer section? Here's a link: Text Explorer
Much will depend on how you curate your documents, so pay attention to that first, before any analytics.
Re: How to calculate TF-IDF
Hi Mark,
I've got the same issue. I need to solve this as part of an assignment for my degree. Did you find a solution?
Best,
Martin
Re: How to calculate TF-IDF
Thanks, everyone. With your collective help I have managed to find the solution:
In Text Explorer, choose Save Document Term Matrix; the TF-IDF option is found under Weighting.