cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
Voizingu
Level III

Need to join two tables based on non-identical strings (Use Regex?) in JSL

Hello, 

I have a JSL table imported from a tester station. 

The station reports the metrics and some attributes, including the Config Code of the devices under test.

 

My goal is to join another table (config matrix) with some categorical info (component vendor, …), based on the config code.

Unfortunately the config code reported by the station is not a 100% match with the config matrix (case sensitive, missing letter, extra “-“,…)

I would like to find a robust way to match them, and I think the Regex function is the best one, but I am a bit overwhelmed.

I checked a lot of posts on this forum talking about this function but I haven’t found yet a use-case close enough to mine to get me started.

Thanks in advance for your help 

 

-Voiz

 

Below an example of 2 columns to be matched, I used a third column to remove the "-" with substitute(), and a 4th column to compare but the match is still weak on most cases.

 

image.png

 

New Table( "untitled 1",
	Add Rows( 8 ),
	New Column( "Tester config output",
		Character,
		"Nominal",
		Set Values(
			{"U4002LKR1B", "4005FGG1A", "U3001PL-2jR", "U5011G1A", "5001BLG1D",
			"U5003R2A", "U4005BLG1B", "U6034UBLG2A"}
		),
		Set Display Width( 114 )
	),
	New Column( "config matrix name",
		Character,
		"Nominal",
		Set Values(
			{"U4002-LK-R1B", "U4005-FG-G1A", "U3001-PL-2JR", "U5011-G1A",
			"U5001-BL-G1D", "U5003-R2A", "U4005-BL-G1B", "U6034-UBL-G2A"}
		)
	),
	New Column( "Clean Config Matrix Name",
		Character,
		"Nominal",
		Formula( Substitute( :config matrix name, "-", "" ) )
	),
	New Column( "Column 4",
		Character,
		"Nominal",
		Formula( If( :Clean Config Matrix Name == :Tester config output, "Match" ) )
	)
)

 

1 ACCEPTED SOLUTION

Accepted Solutions
jthi
Super User

Re: Need to join two tables based on non-identical strings (Use Regex?) in JSL

I might be incorrect on what you are trying to do, but this is simplified version of what could be done

Names Default To Here(1);

dt1 = New Table("Output",
	New Column("Tester config output",
		Character,
		"Nominal",
		Set Values(
			{"U4002LKR1B", "4005FGG1A", "U3001PL-2jR", "U5011G1A", "5001BLG1D", "U5003R2A",
			"U4005BLG1B", "U6034UBLG2A"}
		)
	)
);

dt2 = New Table("Config",
	New Column("config matrix name",
		Character,
		"Nominal",
		Set Values(
			{"U4002-LK-R1B", "U4005-FG-G1A", "U3001-PL-2JR", "U5011-G1A", "U5001-BL-G1D",
			"U5003-R2A", "U4005-BL-G1B", "U6034-UBL-G2A"}
		)
	),
	New Column("R", Numeric, Nominal, Set Values(1::8))
);

Summarize(dt2, groups = By(:config matrix name));

dt1 << new column("N", Character, Nominal);
For Each Row(dt1,
	:N = Choose Closest(:Tester config output, groups, Ignore Case(1), Ignore Whitespace(1), Ignore Nonprintable(1), Max Edit Count(2), Max Edit Ratio(0.1));
);

dt1 << Update(
	With(dt2),
	Match Columns(:N = :config matrix name),
	Replace Columns in Main Table(None)
);

Write();
-Jarmo

View solution in original post

3 REPLIES 3
jthi
Super User

Re: Need to join two tables based on non-identical strings (Use Regex?) in JSL

Unless you can specify all the rules which should be followed I would suggest using something like Choose Closest function (or Shortest Edit Script). This post can maybe give some ideas Script the "Group Similar Values" function in the Recode platform with JSL . 

-Jarmo
jthi
Super User

Re: Need to join two tables based on non-identical strings (Use Regex?) in JSL

I might be incorrect on what you are trying to do, but this is simplified version of what could be done

Names Default To Here(1);

dt1 = New Table("Output",
	New Column("Tester config output",
		Character,
		"Nominal",
		Set Values(
			{"U4002LKR1B", "4005FGG1A", "U3001PL-2jR", "U5011G1A", "5001BLG1D", "U5003R2A",
			"U4005BLG1B", "U6034UBLG2A"}
		)
	)
);

dt2 = New Table("Config",
	New Column("config matrix name",
		Character,
		"Nominal",
		Set Values(
			{"U4002-LK-R1B", "U4005-FG-G1A", "U3001-PL-2JR", "U5011-G1A", "U5001-BL-G1D",
			"U5003-R2A", "U4005-BL-G1B", "U6034-UBL-G2A"}
		)
	),
	New Column("R", Numeric, Nominal, Set Values(1::8))
);

Summarize(dt2, groups = By(:config matrix name));

dt1 << new column("N", Character, Nominal);
For Each Row(dt1,
	:N = Choose Closest(:Tester config output, groups, Ignore Case(1), Ignore Whitespace(1), Ignore Nonprintable(1), Max Edit Count(2), Max Edit Ratio(0.1));
);

dt1 << Update(
	With(dt2),
	Match Columns(:N = :config matrix name),
	Replace Columns in Main Table(None)
);

Write();
-Jarmo
Voizingu
Level III

Re: Need to join two tables based on non-identical strings (Use Regex?) in JSL

Hi Jarmo,

 

thanks a lot for your feedback,

I tried this piece of script on a larger dataset and it seems to be accurate enough, it matches >96% of the time.

I added "Replace Unmatched(1), Unmatched (value="NoMatch")" to the arguments of Choose Closest () to make sure it flags when there is no match.

 

Appreciate your help once again

 

Voiz