How do I create a frequency table of all adjacent word pairs?

anderson_a_23

Community Member

Joined:

Apr 15, 2015

I have a column that contains open-ended responses to a survey question. I'd like to create a frequency table of all the adjacent word pairs that appear in the column.

I'm familiar with the "categorical analysis", which creates a frequency table of the individual words. I suspect that, while building that frequency table, it also creates a table of all the individual words, one word per row. If I could create this table, I could join it to itself with the join offset by one record. Any ideas on how to create a table with the individual words transposed?

1 REPLY
Craige_Hales

Staff

Joined:

Mar 21, 2013

Here's some starter JSL; you'll want to re-work it a bit to match your requirements.  It uses an associative array to count the frequencies of the adjacent word pairs.  It assumes you don't want to make a pair from the end of one row to the beginning of the next (among other assumptions).


// Sample data: one open-ended answer per row
dt = New Table( "survey results",
  New Column( "answers", Character, "Nominal",
    Set Values(
      {"this is a test; a test to see if this is going to see the light of day",
      "another time, or another day, or another time-this is a test",
      "duplicate. duplicate, double-double double single-this is a test"}
    )
  )
);

frequency = [=> 0]; // associative array with default value 0, so a missing key answers "Zero!"

For Each Row(
  // split the upper-cased answer into words on space and punctuation delimiters
  listOfWords = Words( Uppercase( dt:answers ), " ,-.;?_" );
  // pair each word with the one after it; pairs never span two rows
  For( i = 1, i < N Items( listOfWords ), i++,
    frequency[listOfWords[i] || "_" || listOfWords[i + 1]]++;
  );
);

Show( frequency );


frequency = ["A_TEST" => 4, "ANOTHER_DAY" => 1, "ANOTHER_TIME" => 2, "DAY_OR" => 1, "DOUBLE_DOUBLE" => 2, "DOUBLE_SINGLE" => 1, "DUPLICATE_DOUBLE" => 1, "DUPLICATE_DUPLICATE" => 1, "GOING_TO" => 1, "IF_THIS" => 1, "IS_A" => 3, "IS_GOING" => 1, "LIGHT_OF" => 1, "OF_DAY" => 1, "OR_ANOTHER" => 2, "SEE_IF" => 1, "SEE_THE" => 1, "SINGLE_THIS" => 1, "TEST_A" => 1, "TEST_TO" => 1, "THE_LIGHT" => 1, "THIS_IS" => 4, "TIME_OR" => 1, "TIME_THIS" => 1, "TO_SEE" => 2, => 0];
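
Since the question asks for a frequency table rather than a log listing, here is a minimal sketch of one way to turn the associative array into a data table. It assumes the `frequency` array built by the script above is still in scope; the table name, column names, and variable names (`pairs`, `counts`, `dtPairs`) are only illustrative, and the sort at the end is optional.

```jsl
// Assumes `frequency` is the associative array built by the script above.
pairs = frequency << Get Keys;     // list of "WORD1_WORD2" keys
counts = frequency << Get Values;  // list of counts, in the same order as the keys

// Table and column names are placeholders; rename to suit.
dtPairs = New Table( "adjacent word pairs",
  New Column( "pair", Character, "Nominal", Set Values( pairs ) ),
  New Column( "count", Numeric, "Continuous", Set Values( Matrix( counts ) ) )
);

// Optional: put the most frequent pairs first
dtPairs << Sort( By( :count ), Order( Descending ), Replace Table );
```

The default value (`=> 0`) is not a key, so it should not show up as a row. If you need the two words in separate columns, the "_" separator would have to be something that cannot occur inside a word; note that "_" is also one of the delimiters passed to Words() above, so it works for this sample.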

Craige