Discussions

markw · Jun 8, 2023 5:25 PM

Dear all,

I am new to using this software and am struggling to figure out how to use it to calculate the TF-IDF values for documents in my data set.

I am sure it is included in the functionality of JMP but it must be under a different name. Would anyone be able to point me in the right direction?

Thanks

Mark

Georg · Dec 3, 2020 11:43 AM

You may want to have a look at Text Explorer (main menu: Analyze).

Please see the following example script; you can also find examples in the Scripting Index or the Statistics Index (main menu: Help).

Names Default To Here( 1 );
// Open a sample data table that ships with JMP
dt = Open( "$SAMPLE_DATA/Consumer Preferences.jmp" );
// Launch Text Explorer on the free-text column
obj = dt << Text Explorer( Text Columns( :Reasons Not to Floss ) );

Georg

markw · Dec 3, 2020 11:50 AM

Thank you for your reply, Georg. Please note I am a complete beginner, so I need a very basic explanation...

In practice I have a set of x documents, for 2 of which I would like to calculate the TF-IDF.

When I go to Text Explorer, can I get JMP to calculate the TF-IDF using menu options, and how do I specify which documents the TF-IDF should be calculated for?

mlo1 · Dec 3, 2020 12:12 PM

Hello

Did you check out the "Local Data Filter" to choose the documents you want?

Georg · Dec 3, 2020 12:42 PM

My approach would be to import the text into a data table first (JMP can even import full folders via Multiple File Import …) and then use Text Explorer on that data table.

So this depends on the type and structure of your files.

Georg

mlo1 · Dec 3, 2020 01:05 PM

It may also be worth checking out the following add-in: Text Importer - Text, PDF, Word Documents, and Powerpoint

Mark_Bailey · Dec 3, 2020 02:22 PM

TF-IDF is one of the built-in transforms that may be applied to the document-term matrix when you save it back to the data table from the Text Explorer platform.
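
For anyone who wants to see what a TF-IDF weighting does to a document-term matrix, here is a minimal Python sketch. It is not JMP's implementation: it assumes plain whitespace tokenization and the common tf × log10(N / df) form, and JMP's exact tokenization and formula may differ, so check the Text Explorer documentation. The function name tf_idf and the sample sentences are made up for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return (vocab, matrix); matrix[i][j] is the weight of vocab[j] in docs[i].

    Assumed weighting (illustrative only): tf * log10(n_docs / df).
    """
    # Naive tokenization: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(docs)

    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    matrix = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term counts for this document
        matrix.append([tf[term] * math.log10(n_docs / df[term]) for term in vocab])
    return vocab, matrix


# Hypothetical mini corpus, one string per document (one row of the table).
docs = ["floss hurts my gums", "no time to floss", "floss is boring"]
vocab, weights = tf_idf(docs)
```

In this sketch a term that occurs in every document (here "floss") gets weight 0, while a term unique to one document gets the largest weight. Because the document frequency is counted over the whole set, the weights for two documents of interest are computed over all documents first, and then only those two rows are read off.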

P_Bartell · Dec 3, 2020 03:51 PM

Have you looked at the JMP documentation, Text Explorer section? Here's a link: Text Explorer

Much will depend on how you curate your documents, so pay attention to that first, before any analytics.

Martin1407 · Dec 4, 2020 05:03 AM

Hi Mark,

I've got the same issue. I need to solve this as part of an assignment for my degree. Did you find a solution?

Best,

Martin

markw · Dec 4, 2020 06:25 AM

Thanks, everyone. With your collective help I have managed to find the solution:

In Text Explorer, choose Save Document Term Matrix; the TF-IDF option is found under Weighting.
