Discussions

RA899 · Mar 20, 2024 07:03 PM

Hi all,

I'm writing a jsl script to find the repeated string entries (i.e. rows) in a column. The column is basically listing different tests that usually is repeated.

What I'm trying to find is that if I can find the number of times those rows have been repeated, select the top N repeated entries, and then insert those selected entries into a list to be analyzed by graph builder or tabulate.

Any help is appreciated.

txnelson · Mar 20, 2024 07:59 PM

Here is how I would approach this

names default to here(1);
dt=current data table();

// Get each different string value and the number of repeated strings
summarize(dt, byGroup = by(:sex), number = count(:height));
show( byGroup, number );

// To find which rows that have the same values
theRows = dt << get rows where( :sex == byGroup[1] );
show(theRows);

Jim

View solution in original post

jthi · Mar 21, 2024 03:06 PM

You can use Summarize to find count of items and then to find top 5 you can for example use Reverse(Rank()) on the count and use those found indices on your groups

Names Default To Here(1);

dt = Open("$SAMPLE_DATA/Big Class.jmp");

// Get each different string value and the number of repeated strings
Summarize(dt, uniq = by(:age), count = Count(:height)); // some column which is numeric to Count

r = Reverse(Rank(count));
top5 = Try(uniq[r][1::5], uniq[r]);

show(uniq, count, top5);

-Jarmo

View solution in original post

txnelson · Mar 20, 2024 07:59 PM

Here is how I would approach this

names default to here(1);
dt=current data table();

// Get each different string value and the number of repeated strings
summarize(dt, byGroup = by(:sex), number = count(:height));
show( byGroup, number );

// To find which rows that have the same values
theRows = dt << get rows where( :sex == byGroup[1] );
show(theRows);

Jim

RA899 · Mar 21, 2024 01:26 PM

thank you @txnelson for your solution. This works perfectly in find the duplicate columns and how many times they were duplicated. The issue I'm having now is that I'm trying to find the top N (ex: N =5) repeated strings and throw them in a list for further evaluation. I was only being able to find the top item in the list.

My question, is there are a way to find the top N items where N can be any integer? Thanks!

I have modified the code as seen below:

names default to here(1);
dt=current data table();
// Get each different string value and the number of repeated strings
summarize(dt, byGroup = by(:Temp_), number = count(:Site));


list = {};
list[1] = 0;
// To find which rows that have the same values
for(i=2, i <N Items(byGroup),i++,
 list[i] = N Items ( theRows = dt << get rows where( :Temp_ == byGroup[i] ) )   ; // creating a list of how many times those items were repeated. 
 //numoffail = N items(list);
  buf = max(list); // finding how many times the top repeated items were repeated. 
 	highest =byGroup[loc(list,buf)]; // match the number of repetition to the repeated string to find the most repeated  
);

jthi · Mar 21, 2024 01:27 AM

Could you provide example (and solution to the example) so it is easier to understand what you are trying to achieve as at least three different cases comes to my mind.

-Jarmo

RA899 · Mar 21, 2024 02:40 PM

Hi @jthi.

An example of that would be Sex column in big class jmp data. There is F and M in the column, and it is repeated multiple times. I would like to write a jsl script that counts how many times F or M was repeated. The code that @txnelson provided works perfectly in this case. Does this make more sense ? Thanks

jthi · Mar 21, 2024 03:06 PM

You can use Summarize to find count of items and then to find top 5 you can for example use Reverse(Rank()) on the count and use those found indices on your groups

Names Default To Here(1);

dt = Open("$SAMPLE_DATA/Big Class.jmp");

// Get each different string value and the number of repeated strings
Summarize(dt, uniq = by(:age), count = Count(:height)); // some column which is numeric to Count

r = Reverse(Rank(count));
top5 = Try(uniq[r][1::5], uniq[r]);

show(uniq, count, top5);

-Jarmo

Discussions

Finding how many times an entry has been repeated within a column using JSL

Re: Finding how many times an entry has been repeated within a column using JSL

Re: Finding how many times an entry has been repeated within a column using JSL

Re: Finding how many times an entry has been repeated within a column using JSL

Re: Finding how many times an entry has been repeated within a column using JSL

Re: Finding how many times an entry has been repeated within a column using JSL

Re: Finding how many times an entry has been repeated within a column using JSL

Re: Finding how many times an entry has been repeated within a column using JSL

Recommended Articles