cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
john_madden
Level VI

How do I combine tables

 

I have two tables, A and B. I want to write JSL that will find all the rows in A whose ID is represented one or more times in B, and then append those rows to B to yield a new table C. 

 

For example, let this be A:

ID

Data

Day

1

Red

Sun

2

Orange

Mon

2

Yellow

Tue

3

Blue

Wed

4

Indigo

Thu

4

Violet

Fri

4

Fuchsia

Sat

5

Teal

Sun

6

Black

Mon

 

And let this be B:

ID

Data

Day

6

Aqua

Tue

1

Pink

Wed

1

Beige

Thu

4

White

Fri

9

Gold

Sat

10

Silver

Sun

 

I want code to generate a table C that looks like this (row order doesn't matter):

ID

Data

Day

6

Aqua

Tue

1

Pink

Wed

1

Beige

Thu

4

White

Fri

9

Gold

Sat

10

Silver

Sun

1

Red

Sun

4

Indigo

Thu

4

Violet

Fri

4

Fuchsia

Sat

6

Black

Mon

 

This seems simple, but I've confused myself as to the best way to do it. The method should scale well with respect to speed and memory for A and B tables of >1 million rows.

2 ACCEPTED SOLUTIONS

Accepted Solutions
cwillden
Super User (Alumni)

Re: How do I combine tables

I think this should be relatively computationally efficient, and you can don't have to do it exactly this way.  Basic idea is to create a table that is a subset of the Table A where the IDs are also in Table B.  Then concatenate the subset of A to Table B.

 

dtA = Data Table("Table A");
dtB = Data Table("Table B");

b.ID = dtB:ID << Get Values;
b_ID_unique = associative array(b.ID) << Get Keys;

dtA << Select Where(Contains(b_ID_unique,:ID) > 0);

dtASelected = dtA << Subset( Selected Rows );

dtB << Concatenate(
dtASelected,
Output Table( "Table C" )
);

Close(dtASelected, No Save);

Other approach might be to hide and exclude the data that doesn't match the criteria.  This eliminates the need to create a subset table, but the hidden and excluded data come along with the concatenation.  You could select the hidden and excluded rows and delete them, but that might not be a very robust solution if you have other rows hidden and excluded that you wouldn't want to delete.

Third option is make a column formula indicating if row in A has an ID that exists in B, then delete the resulting rows in C that aren't a match based on the values in that column.

I'm sure there are numerous other ways to skin this cat, but hopefully this will do the trick.

 

-- Cameron Willden

View solution in original post

txnelson
Super User

Re: How do I combine tables

Here is a Join centric version of a script that will work....it should be pretty efficient

names default to here(1);
dtA = data table("A");
dtB = data table("B");

// Create data tables with only the data required
dtAID = dtA << Subset(invisible, selected rows(0), columns(:ID));
dtBID = dtB << Subset(invisible, selected rows(0), columns(:ID));

// Create a data table with only IDs found in both tables
dtABID = dtAID << Join(invisible, with(dtBID ),
	Merge Same Name Columns,
	By Matching Columns( :ID = :ID ),
	Drop multiples( 0, 0 ),
	Include Nonmatches( 0, 0 ),
	Preserve main table order( 1 )
);

close( dtAID, nosave );
close( dtBID, nosave );

// Create a table with only 1 row for each ID found
dtSum = dtABID << Summary(invisible, 
	Group( :ID ),
	Freq( "None" ),
	Weight( "None" )
);

// Create a data table that only has the correct IDs
// from data table A
dtAMatches = dtA << Join(invisible, 
	with(dtSum ),
	Merge Same Name Columns,
	By Matching Columns( :ID = :ID ),
	Drop multiples( 0, 0 ),
	Include Nonmatches( 0, 0 ),
	Preserve main table order( 1 )
);

close( dtSum, nosave );

// Delete column Match Flag and N Rows
dtAMatches << delete columns("Match Flag", "N Rows");

// Concatenate table B with the matched data from table A
dtFinal = dtB << Concatenate( dtAMatches );

close( dtAMatches, nosave );

Jim

View solution in original post

4 REPLIES 4
cwillden
Super User (Alumni)

Re: How do I combine tables

I think this should be relatively computationally efficient, and you can don't have to do it exactly this way.  Basic idea is to create a table that is a subset of the Table A where the IDs are also in Table B.  Then concatenate the subset of A to Table B.

 

dtA = Data Table("Table A");
dtB = Data Table("Table B");

b.ID = dtB:ID << Get Values;
b_ID_unique = associative array(b.ID) << Get Keys;

dtA << Select Where(Contains(b_ID_unique,:ID) > 0);

dtASelected = dtA << Subset( Selected Rows );

dtB << Concatenate(
dtASelected,
Output Table( "Table C" )
);

Close(dtASelected, No Save);

Other approach might be to hide and exclude the data that doesn't match the criteria.  This eliminates the need to create a subset table, but the hidden and excluded data come along with the concatenation.  You could select the hidden and excluded rows and delete them, but that might not be a very robust solution if you have other rows hidden and excluded that you wouldn't want to delete.

Third option is make a column formula indicating if row in A has an ID that exists in B, then delete the resulting rows in C that aren't a match based on the values in that column.

I'm sure there are numerous other ways to skin this cat, but hopefully this will do the trick.

 

-- Cameron Willden
txnelson
Super User

Re: How do I combine tables

Here is a Join centric version of a script that will work....it should be pretty efficient

names default to here(1);
dtA = data table("A");
dtB = data table("B");

// Create data tables with only the data required
dtAID = dtA << Subset(invisible, selected rows(0), columns(:ID));
dtBID = dtB << Subset(invisible, selected rows(0), columns(:ID));

// Create a data table with only IDs found in both tables
dtABID = dtAID << Join(invisible, with(dtBID ),
	Merge Same Name Columns,
	By Matching Columns( :ID = :ID ),
	Drop multiples( 0, 0 ),
	Include Nonmatches( 0, 0 ),
	Preserve main table order( 1 )
);

close( dtAID, nosave );
close( dtBID, nosave );

// Create a table with only 1 row for each ID found
dtSum = dtABID << Summary(invisible, 
	Group( :ID ),
	Freq( "None" ),
	Weight( "None" )
);

// Create a data table that only has the correct IDs
// from data table A
dtAMatches = dtA << Join(invisible, 
	with(dtSum ),
	Merge Same Name Columns,
	By Matching Columns( :ID = :ID ),
	Drop multiples( 0, 0 ),
	Include Nonmatches( 0, 0 ),
	Preserve main table order( 1 )
);

close( dtSum, nosave );

// Delete column Match Flag and N Rows
dtAMatches << delete columns("Match Flag", "N Rows");

// Concatenate table B with the matched data from table A
dtFinal = dtB << Concatenate( dtAMatches );

close( dtAMatches, nosave );

Jim
john_madden
Level VI

Re: How do I combine tables

The associative array option is quite neat and I haven't used this concept very much, and never as a kind of alternative to a join. The option using the join is great to see because that was the approach that I was trying to work out for myself, but got confused.

 

I'm grateful to both of you for confirming that while relatively simple, this doesn't have a "one-step" or unique solution.

 

Thanks to both of you!

 

John

john_madden
Level VI

Re: How do I combine tables

The associative array option is quite neat and I haven't used this concept very much, and never as a kind of alternative to a join. The option using the join is great to see because that was the approach that I was trying to work out for myself, but got confused.

 

I'm grateful to both of you for confirming that while relatively straightforward, this doesn't have a "one-step" or unique solution.

 

Thanks to both of you!

 

John