Subscribe Bookmark RSS Feed

making a join more based on a list of common vars

Highlighted
newbie_alex

Contributor

Joined:

May 18, 2017

Hello,

 

I would like to replace the hard coded column names in this example:

 

dt = dt1 << Join(
	With( dt_ref ),
	Select(
		:result_Rx_RSSI_,
		:result_snr_Rx_RSSI_,
		:type_,
		:rat_,
		:trx_,
	),
	SelectWith( :result_Rx_RSSI_ref, :result_snr_Rx_RSSI_ref, ),
	By Matching Columns(
		:type_ = :type_,
		:rat_ = :rat_,
		:trx_ = :trx_,
	), 
	Drop multiples( 0, 0 ),
	Include Nonmatches( 0, 0 ),
	Preserve main table order( 1 ),
	Output Table( "merged_results" )
);

with something that starts with

common_vars = {"type_", "rat_", "trx};
dt1_only_vars = {"result_Rx_RSSI_", "result_snr_Rx_RSSI_"};
dt_ref_only_vars = {"result_Rx_RSSI_ref", "result_snr_Rx_RSSI_ref"};

... and this should lead to the same result.

 

How do I do that?

 

Thank you for your help!

1 ACCEPTED SOLUTION

Accepted Solutions
newbie_alex

Contributor

Joined:

May 18, 2017

Solution

Thank you for the feedback.

 

This is how I solved it thanks to this blog entry:

 

http://www.pega-analytics.co.uk/blog/handling-lists-of-columns/

 

common_vars = {"type_", "rat_", "trx"};
dt1_only_vars = {"result_Rx_RSSI_", "result_snr_Rx_RSSI_"};
dt_ref_only_vars = {"result_Rx_RSSI_ref", "result_snr_Rx_RSSI_ref"};

col_names = dt1_only_vars || common_vars;

match_str = "";
For( i = 1, i <= N Items( common_vars ), i++,
	match_col_name = common_vars[i];
	mstr = Eval Insert( ":^match_col_name^ = :^match_col_name^," );
	match_str = match_str || mstr;
);

// this is UGLY but I do not know any other way (yet?)
dt_str = Eval Insert(
	"dt = dt1 << Join(
	With( dt_ref ),
	Select(
		Eval( col_names )

	),
	SelectWith( Eval(dt_ref_only_vars) ),
	By Matching Columns(
		^match_str^
	), 
	
	Drop multiples( 0, 0 ),
	Include Nonmatches( 0, 0 ),
	Preserve main table order( 1 ),
	Output Table( \!"merged_results\!" ));"
);
Eval( Parse( dt_str ) );
6 REPLIES
kevin_c_anderso

Community Trekker

Joined:

Jun 5, 2014

Hi, Alex!

There are many ways to do that.

Can you post some (anonymized, if necessary,) dt1 and dt_ref datasets?
newbie_alex

Contributor

Joined:

May 18, 2017

dt = Open( "$SAMPLE_DATA/Decathlon.jmp" );

dt_no_score = dt << Subset(
	Output Table( "Decathlon_no_score" ),
	All rows,
	columns(
		:Name,
		:Country,
		:Name( "100m" ),
		:Long Jump,
		:Shot Put,
		:High Jump,
		:Name( "400m" ),
		:Name( "100m hurdles" ),
		:Discus,
		:Pole Vault,
		:Javelin,
		:Name( "1500m" )
	)
);

dt_no_country = dt << Subset(
	Output Table( "Decathlon_no_country" ),
	All rows,
	columns(
		:Name,
		:Score,
		:Name( "100m" ),
		:Long Jump,
		:Shot Put,
		:High Jump,
		:Name( "400m" ),
		:Name( "100m hurdles" ),
		:Discus,
		:Pole Vault,
		:Javelin,
		:Name( "1500m" )
	)
);

dt_recombined = dt_no_country << Join(
	With( dt_no_score ),
	Select(
		:Name,
		:Score,
		:Name( "100m" ),
		:Long Jump,
		:Shot Put,
		:High Jump,
		:Name( "400m" ),
		:Name( "100m hurdles" ),
		:Discus,
		:Pole Vault,
		:Javelin,
		:Name( "1500m" )
	),
	SelectWith( :Country ),
	By Matching Columns(
		:Name = :Name,
		:Name( "100m" ) = :Name( "100m" ),
		:Long Jump = :Long Jump,
		:Shot Put = :Shot Put,
		:High Jump = :High Jump,
		:Name( "400m" ) = :Name( "400m" ),
		:Name( "100m hurdles" ) = :Name( "100m hurdles" ),
		:Discus = :Discus,
		:Pole Vault = :Pole Vault,
		:Javelin = :Javelin,
		:Name( "1500m" ) = :Name( "1500m" )
	),
	Drop multiples( 0, 0 ),
	Include Nonmatches( 0, 0 ),
	Preserve main table order( 1 )
)
kevin_c_anderso

Community Trekker

Joined:

Jun 5, 2014

Hi, Alex!

 

First, thank you for making your example much more difficult to type than your original question!  :)  

 

I'm unclear on why you wish to use a list of common variables for a join.  Do the variables change?  Would User Dialogs be useful?  Your reasons may influence the direction you should take.

 

Nonetheless, let's consider that someone will have to maintain the "reference values" in your No Score dataset, and it would be helpful to record where No Country dataset values have no No Score values...that would indicate a Left Outer Join would be appropriate.

 

dt_recombined = Query( 
	Table( dt_no_country, "t1" ),
	Table( dt_no_score, "t2" ),
  "SELECT t1.Name AS [Name from Table1], t1.Score, t1.[100m], t1.[Long Jump], t1.[Shot Put], t1.[High Jump], t1.[400m], t1.[100m hurdles], t1.Discus, t1.[Pole Vault], t1.Javelin, t1.[1500m], 
    t2.Name AS [Name from Table2], t2.Country  
   FROM t1  
	LEFT OUTER JOIN t2  
	ON  ( t1.Name = t2.Name and
			t1.[100m] = t2.[100m] and
			t1.[Long Jump] = t2.[Long Jump] and
			t1.[Shot Put] = t2.[Shot Put] and
			t1.[High Jump] = t2.[High Jump] and
			t1.[400m] = t2.[400m] and
			t1.[100m hurdles] = t2.[100m hurdles] and
			t1.Discus = t2.Discus and
			t1.[Pole Vault] = t2.[Pole Vault] and
			t1.Javelin = t2.Javelin and
			t1.[1500m] = t2.[1500m]  ) " ); 

If you really need, you can easily turn the variables into expressions you can evaluate and substitute at will.

newbie_alex

Contributor

Joined:

May 18, 2017

Solution

Thank you for the feedback.

 

This is how I solved it thanks to this blog entry:

 

http://www.pega-analytics.co.uk/blog/handling-lists-of-columns/

 

common_vars = {"type_", "rat_", "trx"};
dt1_only_vars = {"result_Rx_RSSI_", "result_snr_Rx_RSSI_"};
dt_ref_only_vars = {"result_Rx_RSSI_ref", "result_snr_Rx_RSSI_ref"};

col_names = dt1_only_vars || common_vars;

match_str = "";
For( i = 1, i <= N Items( common_vars ), i++,
	match_col_name = common_vars[i];
	mstr = Eval Insert( ":^match_col_name^ = :^match_col_name^," );
	match_str = match_str || mstr;
);

// this is UGLY but I do not know any other way (yet?)
dt_str = Eval Insert(
	"dt = dt1 << Join(
	With( dt_ref ),
	Select(
		Eval( col_names )

	),
	SelectWith( Eval(dt_ref_only_vars) ),
	By Matching Columns(
		^match_str^
	), 
	
	Drop multiples( 0, 0 ),
	Include Nonmatches( 0, 0 ),
	Preserve main table order( 1 ),
	Output Table( \!"merged_results\!" ));"
);
Eval( Parse( dt_str ) );
ian_jmp

Staff

Joined:

Jun 23, 2011

Perhaps I have oversimplified your case but, FYI, this also works:

dt1 = NewTable("One",
		NewColumn("A", Values([1,2])),
		NewColumn("B", Values([3,4])),
		NewColumn("K1", Values([5,6])),
		NewColumn("K2", Values([7,8]))
		);

dt2 = NewTable("Two",
		NewColumn("C", Values([9,10])),
		NewColumn("D", Values([11,12])),
		NewColumn("K2", Values([5,6])),
		NewColumn("K1", Values([7,8]))
		);

dt1KeyCols = {"K1", "K2"};
dt2KeyCols = {"K2", "K1"};
dtFinal = dt1 << Join(
					With(dt2),
					By Matching Columns( dt1KeyCols = dt2KeyCols ),
					Output Table( "Joined" )
					);
newbie_alex

Contributor

Joined:

May 18, 2017

Thank you! This is much easier to read than my approach.