markw
Level I

How to calculate TF-IDF

Dear all,

 

I am new to using this software and am struggling to figure out how to use it to calculate the TF-IDF values for documents in my data set. 

 

I am sure it is included in the functionality of JMP but it must be under a different name. Would anyone be able to point me in the right direction?

 

Thanks

Mark

11 REPLIES
Georg
Level VII

Re: How to calculate TF-IDF

You could have a look at Text Explorer (main menu: Analyze).

Please see the following example script; similar examples are available in the Scripting Index and the Statistics Index (main menu: Help).

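Names Default To Here( 1 );
// Open a sample table that contains a free-text column
dt = Open( "$SAMPLE_DATA/Consumer Preferences.jmp" );
// Launch Text Explorer on that text column
obj = dt << Text Explorer( Text Columns( :Reasons Not to Floss ) );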
Georg
markw
Level I

Re: How to calculate TF-IDF

Thank you for your reply, Georg. Please note I am a complete beginner, so I need a very basic explanation...

In practice I have a set of x documents, and I would like to calculate the TF-IDF for 2 of them.

When I go to Text Explorer, can I get JMP to calculate the TF-IDF using menu options, and how do I specify which documents the TF-IDF should be calculated for?

mlo1
Level IV

Re: How to calculate TF-IDF

Hello,

Did you check out the "Local Data Filter" to choose the documents you want?
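
A minimal JSL sketch of that idea, in case a script is easier to follow than the menus. The table, its column names (Doc ID, Body), and the document texts are made up for illustration and are not from the original question.

```jsl
Names Default To Here( 1 );
// Hypothetical three-document table, for illustration only
dt = New Table( "Example Documents",
	New Column( "Doc ID", Character, Set Values( {"A", "B", "C"} ) ),
	New Column( "Body", Character,
		Set Values( {"the cat sat on the mat", "the dog chased the cat", "dogs and cats are pets"} )
	)
);
// Launch Text Explorer on the text column
obj = dt << Text Explorer( Text Columns( :Body ) );
// Restrict the report to documents A and B only
obj << Local Data Filter(
	Add Filter( Columns( :Doc ID ), Where( :Doc ID == {"A", "B"} ) )
);
```

One thing to keep in mind: a filter changes which documents are in the analysis, so any weighting that depends on how many documents contain a term would presumably be based on the filtered subset rather than the full collection.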


Georg
Level VII

Re: How to calculate TF-IDF

My approach would be to first import the text into a data table (JMP can even import whole folders via Import Multiple Files …),

and then use Text Explorer on that data table.

How exactly to do this depends on the type and structure of your files.

Georg
mlo1
Level IV

Re: How to calculate TF-IDF

It may also be worth checking out the following add-in: Text Importer - Text, PDF, Word Documents, and Powerpoint

 

Re: How to calculate TF-IDF

TF-IDF is one of the built-in transforms that may be applied to the document-term matrix when you save it back to the data table from the Text Explorer platform.
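
For reference, the common textbook form of the weighting is sketched below. This is the generic definition, not necessarily the exact variant JMP uses (log base and any smoothing may differ), so check the Text Explorer documentation for the formula it applies. Here tf(t, d) is the count of term t in document d, N is the number of documents, and df(t) is the number of documents containing t.

```latex
\mathrm{tfidf}(t, d) = \mathrm{tf}(t, d) \cdot \log\frac{N}{\mathrm{df}(t)}
```

As a worked example with a natural log: a term that appears 3 times in a document and occurs in 2 of 10 documents gets 3 × ln(10/2) = 3 × ln 5 ≈ 4.83.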

P_Bartell
Level VIII

Re: How to calculate TF-IDF

Have you looked at the JMP documentation, Text Explorer section? Here's a link: Text Explorer 

Much will depend on how you curate your documents, so pay attention to that first, before any analytics.

Martin1407
Level I

Re: How to calculate TF-IDF

Hi Mark, 

 

I've got the same issue. I need to solve this as part of an assignment for my degree. Did you find a solution?

 

Best, 

 

Martin 

markw
Level I

Re: How to calculate TF-IDF

Thanks, everyone. With your collective help I have managed to find the solution:

 

In Text Explorer, choose Save Document Term Matrix; the option for TF-IDF is found under Weighting.
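
For anyone who prefers a script, the same steps could be sketched in JSL roughly as follows. It reuses the sample table from Georg's example. The term limits are arbitrary illustrative values, and the option names and the "TF IDF" weighting string are assumptions based on the dialog labels, so please confirm them against the Scripting Index entry for Save Document Term Matrix in your JMP version.

```jsl
Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Consumer Preferences.jmp" );
obj = dt << Text Explorer( Text Columns( :Reasons Not to Floss ) );
// Save the document-term matrix back to the data table with TF-IDF weighting.
// Assumption: argument names and the "TF IDF" string mirror the dialog options;
// the two numeric limits are illustrative, not recommendations.
obj << Save Document Term Matrix(
	Maximum Number of Terms( 50 ),
	Minimum Term Frequency( 10 ),
	Weighting( "TF IDF" )
);
```

The saved columns (one per term) are added to the data table with one row per document, so the values for the particular documents of interest can be read from their rows.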