cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
Djtjhin
Level IV

Combining Data from multiple tables

I'm trying to create a new master table by combining data from two tables (or more). As an example, I'm trying to combine Table A & Table B below into Table A-B as the result:

Djtjhin_0-1620273011098.pngDjtjhin_1-1620273026277.pngDjtjhin_2-1620273044453.png

Basically both table A & B have similar column in "start" & "end" but different range for each row. What I'm trying to do is to combine the segments from starting distance 1 to 20 from both table and combine them into table A-B. 

Any idea on how to do this through either scripting or the Tables functions ?

2 ACCEPTED SOLUTIONS

Accepted Solutions
jthi
Super User

Re: Combining Data from multiple tables

This can give an idea how you could do this with scripting:

 

Names Default To Here(1);

dt1 = New Table("Table A",
	Add Rows(4),
	Compress File When Saved(1),
	New Column("Start",
		Numeric,
		"Continuous",
		Format("Best", 12),
		Set Values([1, 5, 10, 15])
	),
	New Column("End",
		Numeric,
		"Continuous",
		Format("Best", 12),
		Set Values([5, 10, 15, 20])
	),
	New Column("Name", Character, "Nominal", Set Values({"A", "B", "C", "D"}))
);

dt2 = New Table("Table B",
	Add Rows(5),
	Compress File When Saved(1),
	New Column("Start",
		Numeric,
		"Continuous",
		Format("Best", 12),
		Set Values([1, 4, 8, 12, 16])
	),
	New Column("End",
		Numeric,
		"Continuous",
		Format("Best", 12),
		Set Values([4, 8, 12, 16, 20])
	),
	New Column("Name",
		Character,
		"Nominal",
		Set Values({"X1", "X2", "X3", "X4", "X5"})
	)
);

//create collection table
dt_final = New Table("Table A-B",
	Compress File When Saved(1),
	New Column("Start", Numeric, "Continuous", Format("Best", 12)),
	New Column("End", Numeric, "Continuous", Format("Best", 12)),
	New Column("Name", Character, "Nominal")
);

For Each Row(dt1,
	startVal = Column(dt1, "Start")[Row()];
	endVal = Column(dt1, "End")[Row()];
	nameVal = Column(dt1, "Name")[Row()];
	r_dt2 = dt2 << get rows where(startVal <= :Start <= endVal | startVal <= :End <= endVal);
	For(i = 1, i <= N Items(r_dt2), i++,
		newEnd = Min(Column(dt2, "End")[r_dt2[i]], endVal);
		newName = nameVal || " " ||Column(dt2, "Name")[r_dt2[i]];
		newRow = EvalList({startVal, newEnd, newName});
		dt_final << Add rows({Start = newRow[1], End = newRow[2], Name = newRow[3]});
		startVal = newEnd;
		endVal= Max(Column(dt2, "End")[r_dt2[i]], endVal);
	);
);

Close(dt1, no save);
Close(dt2, no save);
-Jarmo

View solution in original post

Re: Combining Data from multiple tables

EDIT: I re-wrote this to make it much shorter, more efficient and free of the integer-based limitations of my original response. It is also straightforward to adapt the code to handle more than 2 source tables.

////

 

 

You can also tackle this with JSL's version of a table-lookup, using Loc Sorted().

 

Cheers,

Brady

 

Names Default To Here( 1 );

dt1 = New Table( "Table A",
	Add Rows( 4 ),
	Compress File When Saved( 1 ),
	New Column( "Start", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [1, 5, 10, 15] ) ),
	New Column( "End", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [5, 10, 15, 20] ) ),
	New Column( "Name", Character, "Nominal", Set Values( {"A", "B", "C", "D"} ) )
);

dt2 = New Table( "Table B",
	Add Rows( 5 ),
	Compress File When Saved( 1 ),
	New Column( "Start", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [1, 4, 8, 12, 16] ) ),
	New Column( "End", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [4, 8, 12, 16, 20] ) ),
	New Column( "Name", Character, "Nominal", Set Values( {"X1", "X2", "X3", "X4", "X5"} ) )
);

//////////// Begin processing here...
ends = Matrix( Associative Array( Concat( As List( dt1:End << get values ), As List( dt2:End << get values ) ) ) << get keys );
starts = [1] |/ ends[1 :: (N Row( ends ) - 1)];
dt3 = As Table( starts || ends, <<column names( {"Start", "End"} ) );
dt1Starts = dt1:Start << get values;
dt2Starts = dt2:Start << get values;
dt3 << New Column( "Name ID",
	formula( dt1:Name[Loc Sorted( dt1Starts, :Start )][1] || " " || dt2:Name[Loc Sorted( dt2Starts, :Start )][1] )
);
dt3:Name ID << Delete Formula;

 

View solution in original post

4 REPLIES 4
jthi
Super User

Re: Combining Data from multiple tables

This can give an idea how you could do this with scripting:

 

Names Default To Here(1);

dt1 = New Table("Table A",
	Add Rows(4),
	Compress File When Saved(1),
	New Column("Start",
		Numeric,
		"Continuous",
		Format("Best", 12),
		Set Values([1, 5, 10, 15])
	),
	New Column("End",
		Numeric,
		"Continuous",
		Format("Best", 12),
		Set Values([5, 10, 15, 20])
	),
	New Column("Name", Character, "Nominal", Set Values({"A", "B", "C", "D"}))
);

dt2 = New Table("Table B",
	Add Rows(5),
	Compress File When Saved(1),
	New Column("Start",
		Numeric,
		"Continuous",
		Format("Best", 12),
		Set Values([1, 4, 8, 12, 16])
	),
	New Column("End",
		Numeric,
		"Continuous",
		Format("Best", 12),
		Set Values([4, 8, 12, 16, 20])
	),
	New Column("Name",
		Character,
		"Nominal",
		Set Values({"X1", "X2", "X3", "X4", "X5"})
	)
);

//create collection table
dt_final = New Table("Table A-B",
	Compress File When Saved(1),
	New Column("Start", Numeric, "Continuous", Format("Best", 12)),
	New Column("End", Numeric, "Continuous", Format("Best", 12)),
	New Column("Name", Character, "Nominal")
);

For Each Row(dt1,
	startVal = Column(dt1, "Start")[Row()];
	endVal = Column(dt1, "End")[Row()];
	nameVal = Column(dt1, "Name")[Row()];
	r_dt2 = dt2 << get rows where(startVal <= :Start <= endVal | startVal <= :End <= endVal);
	For(i = 1, i <= N Items(r_dt2), i++,
		newEnd = Min(Column(dt2, "End")[r_dt2[i]], endVal);
		newName = nameVal || " " ||Column(dt2, "Name")[r_dt2[i]];
		newRow = EvalList({startVal, newEnd, newName});
		dt_final << Add rows({Start = newRow[1], End = newRow[2], Name = newRow[3]});
		startVal = newEnd;
		endVal= Max(Column(dt2, "End")[r_dt2[i]], endVal);
	);
);

Close(dt1, no save);
Close(dt2, no save);
-Jarmo

Re: Combining Data from multiple tables

EDIT: I re-wrote this to make it much shorter, more efficient and free of the integer-based limitations of my original response. It is also straightforward to adapt the code to handle more than 2 source tables.

////

 

 

You can also tackle this with JSL's version of a table-lookup, using Loc Sorted().

 

Cheers,

Brady

 

Names Default To Here( 1 );

dt1 = New Table( "Table A",
	Add Rows( 4 ),
	Compress File When Saved( 1 ),
	New Column( "Start", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [1, 5, 10, 15] ) ),
	New Column( "End", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [5, 10, 15, 20] ) ),
	New Column( "Name", Character, "Nominal", Set Values( {"A", "B", "C", "D"} ) )
);

dt2 = New Table( "Table B",
	Add Rows( 5 ),
	Compress File When Saved( 1 ),
	New Column( "Start", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [1, 4, 8, 12, 16] ) ),
	New Column( "End", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [4, 8, 12, 16, 20] ) ),
	New Column( "Name", Character, "Nominal", Set Values( {"X1", "X2", "X3", "X4", "X5"} ) )
);

//////////// Begin processing here...
ends = Matrix( Associative Array( Concat( As List( dt1:End << get values ), As List( dt2:End << get values ) ) ) << get keys );
starts = [1] |/ ends[1 :: (N Row( ends ) - 1)];
dt3 = As Table( starts || ends, <<column names( {"Start", "End"} ) );
dt1Starts = dt1:Start << get values;
dt2Starts = dt2:Start << get values;
dt3 << New Column( "Name ID",
	formula( dt1:Name[Loc Sorted( dt1Starts, :Start )][1] || " " || dt2:Name[Loc Sorted( dt2Starts, :Start )][1] )
);
dt3:Name ID << Delete Formula;

 

Djtjhin
Level IV

Re: Combining Data from multiple tables

@jthi & @brady_brady Thanks for the responses. I'm sure the solutions provided work but I want to digest them first and I'll respond back here if they meet my intent. Appreciate the help! Keep you posted.

Re: Combining Data from multiple tables

After some thinking, I found a way to do this interactively. Rather than describe it, I've attached a short video I made that discusses the process. The interactive approach has the advantage that it is the same amount of work (which is not much) no matter how many tables you want to combine. In my example, I use 3. The disadvantage is that it requires some non-intuitive sorting; if you make an error in sorting, you'll get incorrect results.

 

Run this script to generate the tables I use in the example:

 

Names Default To Here(1);

dta = new table("A",
          <<addrows(6),
          <<new column("Start", formula(8*(row()-1))),
          <<new column("End", formula(:Start + 8)),
          <<new column("A", character, formula("A" || char(row())))
);
dtb = new table("B",
          <<addrows(12),
          <<new column("Start", formula(4*(row()-1))),
          <<new column("End", formula(:Start + 4)),
          <<new column("B", character, formula("B" || char(row())))
);
dtc = new table("C",
          <<addrows(8),
          <<new column("Start", formula(6*(row()-1))),
          <<new column("End", formula(:Start + 6)),
          <<new column("C", character, formula("C" || char(row())))
);