Showing results for 
Show  only  | Search instead for 
Did you mean: 
Choose Language Hide Translation Bar
Get a list of the unique values in a column, matrix, or list

Problem 1:  I need to get a list of the unique values in a data table column.

Problem 2:  I need to get a list of the unique values in a JSL data structure (list or matrix)


For problem 1:

Solution 1:  Use summary tables.  The JSL below shows how to use Summary() to get a table that will have a row with each unique value in a single column.   You can then do further JSL table commands to get those values from the summary table. 


_dt = Open( "$sample_data/big" );
//create a summary table of the :sex column
_dt1 = _dt << Summary(
	Group( :sex ),
	Freq( "None" ),
	Weight( "None" )
//create a summary table of the :age column
_dt2 = _dt << Summary(
	Group( :age ),
	Freq( "None" ),
	Weight( "None" )

2-13-2018 5-22-08 PM.png

Problem 2:  
Now,  instead of a data table you have a JMP list or matrix:
sexlist = {"F", "M", "M", "F", "F", "M", "M", "F", "M"};
agelist = [14, 11, 12, 13, 15, 17, 15, 11, 12];
the complicated ways to do this would be:
a) put the values in the list into a new data table column, and then use the summary approach, OR
b) write a somewhat complicated for loop to search for and record unique values
Fortunately, there is an easier way, using associative arrays.  
Associative Array(sexlist) << Get Keys;
Associative Array(agelist) << Get Keys;
returns {"F", "M"} and {11, 12, 13, 14, 15, 17}, respectively (note that these are nice sorted list!).  If you would like the numeric values as a sorted vector, you can use
Matrix(Associative Array(agelist) << Get Keys);
which returns [11, 12, 13, 14, 15, 17].  
This approach can also be used in place of Solution 1 for Problem 1:
Problem1, Solution 2:
Associative Array(Column(_dt, "sex") << Get Values) << Get Keys;
Matrix(Associative Array(Column(_dt, "age")<< Get Values)<<Get Keys);
returns {"F", "M"} and [11, 12, 13, 14, 15, 17]., respectively. What this is doing is 
  • get the values from the specified column as a list (if the column is character) or a matrix (if the column is numeric)
  • turn that list or martrix into an associative array
  • get the keys for the associative array, which in this case are the unique levels of the values in the column, returning that as a list
  • (if Matrix() is used on the resulting list, it will only work if the values in the list are all numbers).

Summarize() is way much more efficient than associative array, it is very obvious with 2 million rows of data (1sec vs 8sec time lapse). If performance is critical, use Summarize() whenever possible.

Summarize( dt, smrz1 = by( :sex ) ); Summarize( dt, smrz2 = by( :age ) );

* Summarize() does not create intermediate data table as << Summary


Associative Array is great in the way that it saves the effort to put the list/matrix into data table, this is perfect solution for small data size.


JSL Cookbook

If you’re looking for a code snippet or design pattern that performs a common task for your JSL project, the JSL Cookbook is for you.

This knowledge base contains building blocks of JSL code that you can use to reduce the amount of coding you have to do yourself.

It's also a great place to learn from the experts how to use JSL in new ways, with best practices.