cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
Neo
Neo
Level VI

How to stack several columns into a new column ignoring duplicate entries?

I have multi column (character, nominal) data table. I want to get the the entries/items in the last four columns into a new column in the same data table ignoring duplicate entries/items from the columns being stacked. How to achieve this in JSL?

When it's too good to be true, it's neither
1 ACCEPTED SOLUTION

Accepted Solutions
SamGardner
Level VII

Re: How to stack several columns into a new column ignoring duplicate entries?

Here are some ways to do this:

 

  • Table Operations:
    • Stack the columns in to one column using Tables > Stack
    • Use Tables > Summary with that stacked column as the Group role
  • JSL
    • Get column values into a list
      • list_of_values = column(table name, column name) << get values 
      • You can do this again on another column and insert those values into the list
        • list_of_values = Insert Into(list_of_values, column(table name, another column name) << get values)
        • repeat as necessary
      • Use an Associate Array to get the unique values
        • unique_values = (Associative Array(list_of_values)) << get keys
      • put that list of values into a new column in a  data table
Names Default To Here( 1 );

// create table with 4 columns with random letters
nr = 20;
dt = New Table( "Example", addrows( nr ) );

letterlist = {"A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q",
"R", "S", "T", "U", "V", "W", "X", "Y", "Z"};

dt << New Column( "C1", character );
Column( dt, "C1" ) << set values( letterlist[J( nr, 1, Random Integer( 26 ) )] );
dt << New Column( "C2", character );
Column( dt, "C2" ) << set values( letterlist[J( nr, 1, Random Integer( 26 ) )] );
dt << New Column( "C3", character );
Column( dt, "C3" ) << set values( letterlist[J( nr, 1, Random Integer( 26 ) )] );
dt << New Column( "C4", character );
Column( dt, "C4" ) << set values( letterlist[J( nr, 1, Random Integer( 26 ) )] );

// Method 1
dtstack = dt << Stack(
	columns( :C1, :C2, :C3, :C4 ),
	Source Label Column( "Label" ),
	Stacked Data Column( "Data" ),
	Output Table( "Stacked Values" )
);

dtstack << Summary(
	Group( :Data ),
	Freq( "None" ),
	Weight( "None" ),
	output table name( "Unique Values" )
);

//Method 2
list_of_values = Column( dt, "C1" ) << get values;
Insert Into( list_of_values, Column( dt, "C2" ) << get values );
Insert Into( list_of_values, Column( dt, "C3" ) << get values );
Insert Into( list_of_values, Column( dt, "C4" ) << get values );
Print( list_of_values );
unique_values = (Associative Array( list_of_values )) << get keys;

dtunique = New Table( "Unique Values" );
dtunique << New Column( "Unique Values", character );
Column( dtunique, "Unique Values" ) << set values( unique_values );

 

 

View solution in original post

7 REPLIES 7
jthi
Super User

Re: How to stack several columns into a new column ignoring duplicate entries?

Without knowing how your data looks and and how it should look like in the end my best guess is to use combination of Associative Array() << Get Keys and Concat Items()

-Jarmo
Neo
Neo
Level VI

Re: How to stack several columns into a new column ignoring duplicate entries?

@jthi Thanks. But does this drop duplicates?

My Columns (to be stacked) have common Char entries between them but also unique entries. I want to keep each in the new column dropping the duplicates. 

When it's too good to be true, it's neither
jthi
Super User

Re: How to stack several columns into a new column ignoring duplicate entries?

Associative Arrays cannot have duplicated keys, so yes it will (it is also ordered so order might change).

-Jarmo
Neo
Neo
Level VI

Re: How to stack several columns into a new column ignoring duplicate entries?

@jthi OK. Thanks. Is this the correct syntax?

aa = Associative Array( As Column(lastCH), As Column(2ndLastCH), As Column(3rdLastCH), As Column(4rthLastCH)) << get keys();
// Where "lastCH", "2ndlastCH", "3rdlastCH" and "4rthlastCH" are column headers for the last four columns in my data table, supplied from above 
When it's too good to be true, it's neither
SamGardner
Level VII

Re: How to stack several columns into a new column ignoring duplicate entries?

Here are some ways to do this:

 

  • Table Operations:
    • Stack the columns in to one column using Tables > Stack
    • Use Tables > Summary with that stacked column as the Group role
  • JSL
    • Get column values into a list
      • list_of_values = column(table name, column name) << get values 
      • You can do this again on another column and insert those values into the list
        • list_of_values = Insert Into(list_of_values, column(table name, another column name) << get values)
        • repeat as necessary
      • Use an Associate Array to get the unique values
        • unique_values = (Associative Array(list_of_values)) << get keys
      • put that list of values into a new column in a  data table
Names Default To Here( 1 );

// create table with 4 columns with random letters
nr = 20;
dt = New Table( "Example", addrows( nr ) );

letterlist = {"A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q",
"R", "S", "T", "U", "V", "W", "X", "Y", "Z"};

dt << New Column( "C1", character );
Column( dt, "C1" ) << set values( letterlist[J( nr, 1, Random Integer( 26 ) )] );
dt << New Column( "C2", character );
Column( dt, "C2" ) << set values( letterlist[J( nr, 1, Random Integer( 26 ) )] );
dt << New Column( "C3", character );
Column( dt, "C3" ) << set values( letterlist[J( nr, 1, Random Integer( 26 ) )] );
dt << New Column( "C4", character );
Column( dt, "C4" ) << set values( letterlist[J( nr, 1, Random Integer( 26 ) )] );

// Method 1
dtstack = dt << Stack(
	columns( :C1, :C2, :C3, :C4 ),
	Source Label Column( "Label" ),
	Stacked Data Column( "Data" ),
	Output Table( "Stacked Values" )
);

dtstack << Summary(
	Group( :Data ),
	Freq( "None" ),
	Weight( "None" ),
	output table name( "Unique Values" )
);

//Method 2
list_of_values = Column( dt, "C1" ) << get values;
Insert Into( list_of_values, Column( dt, "C2" ) << get values );
Insert Into( list_of_values, Column( dt, "C3" ) << get values );
Insert Into( list_of_values, Column( dt, "C4" ) << get values );
Print( list_of_values );
unique_values = (Associative Array( list_of_values )) << get keys;

dtunique = New Table( "Unique Values" );
dtunique << New Column( "Unique Values", character );
Column( dtunique, "Unique Values" ) << set values( unique_values );

 

 

Neo
Neo
Level VI

Re: How to stack several columns into a new column ignoring duplicate entries?

@SamGardner 

Both these work fine, but appends an empty row if the number of rows is not the same for every column as in the case of my actual data set.

When it's too good to be true, it's neither
SamGardner
Level VII

Re: How to stack several columns into a new column ignoring duplicate entries?

if you insert the code

 

 

remove from(unique_values, loc(unique_values,"")[1])

after the line where the unique_values list is created, that will remove the value from the list that is the empty string.

 

 

 

remove from(unique_values, 1)

should also work, since the keys are sorted alphabetically with they are requested.