- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Dealing with wildcards for joining tables
I need to join multiple tables, but the column I need to join on has wildcards for one of the tables columns. The tables are not overly large so it doens't need to be very efficient.
In the Table1 KEY column below there could be any single alphanmueric character where the question marks are. In the Brackets are OR statements, so it could be 1 or 2 or 3 or 4.
So for example it should join onto rows 1&2 of Table2 since the wildcards would match. But row 3 has a 9 where the OR pattern only has 1|2|3|4.
I was thinking of duplicating the row for each possible alphanumeric combination where the question marks are, and for each combination in the OR brackets then the join would work - not sure how to do this.
Table1
KEY | DATA |
ABC?Y?[1|2|3|4]Z | 99999 |
Table2
KEY | DATE |
ABC1YA1Z | 2019/08/14 10:10:10 PM |
ABCZYB4Z | 2019/08/14 11:11:11 PM |
ABC1YA9Z | 2019/08/14 12:12:12 PM |
Result Table:
KEY | DATE | DATA |
ABC1YA1Z | 2019/08/14 10:10:10 PM | 99999 |
ABCZYB4Z | 2019/08/14 11:11:11 PM | 99999 |
ABC1YA9Z | 2019/08/14 12:12:12PM | . |
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Dealing with wildcards for joining tables
You can convert your KEY format to regular expression format.
dt1 is Table1 and dt2 is Table2.
dt1<<new column("KEY_reg",character,formula(substitute(:KEY,"|","","?","[A-Z0-9]")));
I tried to use the update command but not succesful.
dt2<< update(with(dt1),match columns(regex(:KEY,:KEY_reg ))) //[NOT WORKING]
Here are manual way to do the job.
dt1<<new column("KEY_reg",character,formula(substitute(:KEY,"|","","?","[A-Z0-9]")));
dt2<<new column("DATA");
for(r=1,r<=ncol(dt1),r++,
r1 = dt2<<get rows where(not(is missing(regex(:KEY, dt1[r,"KEY_reg"]))));
dt2[r1,"DATA"]=dt1[r,"DATA"];
);
Thanks.
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Dealing with wildcards for joining tables
Thank you Jara95 - this is very close to the solution. I made some small edits - not sure about how you were referencing column rows using this method:
dt1[r,"DATA"]
I switched to this method and it works:
dt1:DATA[r]
I also found that with Regex you can leave the OR | operators in place, no need to Substitute them out.
One final thing I changed is For loop from N Col to N Row.
So my final script looks something like this:
dt1 << New Column( "KEY_reg", character, formula( Substitute( dt1:KEY, "?", "[A-Z0-9]" ) ) );
dt2 << New Column( "DATA" );
For( r = 1, r <= N Row( dt1 ), r++,
r1 = dt2 << Get Rows Where( !Is Missing( Regex( dt2:KEY, dt1:KEY_reg[r] ) ) );
dt2:DATA[r1] = dt1:DATA[r];
);
Thanks again!