john_madden
Level VI

Fuzzy string match

In the Recode pane, there is an option to group strings that allows fuzzy matching.

Is there a JSL function for doing fuzzy string matching? If there is, I'm having trouble finding it. Help!

John


3 REPLIES
Craige_Hales
Super User

Re: Fuzzy string match

You might be able to use Shortest Edit Script to make one. The example in the scripting index assembles a string of the characters the two strings share in order.
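
To make the idea concrete outside JSL, here is a minimal Python sketch of a similarity score built from the characters two strings share in order (their longest common subsequence). It does not call JMP's Shortest Edit Script; the names lcs_length and lcs_similarity and the 2 * LCS / (len(a) + len(b)) normalization are assumptions chosen for illustration.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the best subsequence that ended before both characters.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise carry over the better of the two neighbors.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a: str, b: str) -> float:
    """Score in [0, 1]: 1.0 for identical strings, 0.0 when nothing is shared."""
    if not a and not b:
        return 1.0
    return 2 * lcs_length(a, b) / (len(a) + len(b))


print(lcs_similarity("My string", "My tsring"))
```

A fuzzy match would then be any pair whose score exceeds a threshold you pick for your data.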

Craige
john_madden
Level VI

Re: Fuzzy string match

As a follow-up, I ran across an open-source Python package that provides all kinds of string similarity measures. It seems to be really well done. It's at:

https://github.com/luozhouyang/python-string-similarity

It was straightforward to write a little JSL function that wraps one of the Python functions in this package. I decided to use the Jaro-Winkler algorithm as implemented there. My function looks like this:

JaroWinkler = Function( {str1, str2},
	{arg, rslt},
	// Build the Python source; Eval Insert replaces ^str1^ and ^str2^ with the argument values
	arg = Eval Insert(
		"\[
from strsimpy.jaro_winkler import JaroWinkler;
jarowinkler = JaroWinkler();
rslt = jarowinkler.similarity('^str1^', '^str2^')
]\"
	);
	Python Init();
	Python Submit( arg );
	// Copy the Python variable rslt back into JSL
	rslt = Python Get( rslt );
	Python Term();
	rslt;
);


rslt = JaroWinkler( "My string", "My tsring" );
Show( rslt );
// Log displays the following: rslt = 0.974074074074074;

(I'm using Python 3.8 on Mac. strsimpy does have a dependence on numpy, which must be installed in your Python.)
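
For readers who want to see what the score measures, below is a rough pure-Python sketch of Jaro-Winkler similarity. It is not the strsimpy source. The function names, the 0.1 prefix scale, the 4-character prefix cap, and the 0.7 boost threshold are the usual textbook conventions and are assumptions here. For the pair in the example above it gives about 0.974.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity: matches within a sliding window, penalized by transpositions."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    matched1 = [False] * len(s1)
    matched2 = [False] * len(s2)
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len(s2), i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Matched characters that line up in a different order count as half-transpositions.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(c1 != c2 for c1, c2 in zip(seq1, seq2)) // 2
    return (
        matches / len(s1)
        + matches / len(s2)
        + (matches - transpositions) / matches
    ) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1,
                 max_prefix: int = 4, boost_threshold: float = 0.7) -> float:
    """Jaro score boosted for a shared prefix (assumed conventional parameters)."""
    score = jaro(s1, s2)
    if score < boost_threshold:
        return score
    prefix = 0
    for c1, c2 in zip(s1[:max_prefix], s2[:max_prefix]):
        if c1 != c2:
            break
        prefix += 1
    return score + prefix * prefix_scale * (1 - score)


print(jaro_winkler("My string", "My tsring"))
```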


john_madden
Level VI

Re: Fuzzy string match

Because the Python snippet delimits both strings with single quotes (JSL uses double quotes), any single quote inside str1 or str2 must be escaped before it is substituted in. Put the following two lines at the beginning of the function:

Substitute Into(str1, "'", "\'");
Substitute Into(str2, "'", "\'"); 
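
To show why the escaping matters, here is a small Python sketch of what happens when text is pasted between single quotes to form Python source. The helper names are hypothetical. The sketch also escapes backslashes first, which the two JSL lines above do not do, so treat that as an extra assumption about inputs that may contain backslashes. Newlines and other control characters are not handled.

```python
import ast


def to_single_quoted_literal(text: str) -> str:
    """Wrap text in single quotes, escaping backslashes first and then quotes."""
    escaped = text.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def is_valid_literal(source: str) -> bool:
    """True if source parses as a Python literal."""
    try:
        ast.literal_eval(source)
        return True
    except (SyntaxError, ValueError):
        return False


name = "O'Brien"
naive = "'" + name + "'"  # the embedded quote ends the string early
safe = to_single_quoted_literal(name)

print(is_valid_literal(naive))   # False
print(ast.literal_eval(safe))    # O'Brien
```

Escaping the backslash before the quote keeps an input that ends in a backslash from swallowing the closing delimiter.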
