john_madden
Level VI

Fuzzy string match

In the Recode pane, there is an option to group strings that allows fuzzy matching.

Is there a JSL function for doing fuzzy string matching? If there is, I'm having trouble finding it. Help!

John

3 REPLIES
Craige_Hales
Super User

Re: Fuzzy string match

You might be able to use Shortest Edit Script to make one. The example in the scripting index assembles a string of the characters the two strings share in order.

Craige
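The "string of the characters the two strings share in order" described above is a longest common subsequence (LCS). As a rough illustration of how that could be turned into a fuzzy-match score, here is a minimal Python sketch. It does not call JMP's Shortest Edit Script; the function names and the scoring formula (twice the LCS length divided by the combined length) are assumptions chosen for illustration.

```python
def longest_common_subsequence(a: str, b: str) -> str:
    """Return one longest string of characters that a and b share in order."""
    rows, cols = len(a), len(b)
    # table[i][j] holds the LCS length of a[:i] and b[:j].
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            if a[i] == b[j]:
                table[i + 1][j + 1] = table[i][j] + 1
            else:
                table[i + 1][j + 1] = max(table[i][j + 1], table[i + 1][j])

    # Walk back through the table to recover the shared characters.
    shared = []
    i, j = rows, cols
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            shared.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(shared))


def lcs_similarity(a: str, b: str) -> float:
    """Hypothetical score in [0, 1]: 1.0 for identical strings, 0.0 for no shared characters."""
    if not a and not b:
        return 1.0
    return 2 * len(longest_common_subsequence(a, b)) / (len(a) + len(b))
```

For "My string" and "My tsring" the shared subsequence has 8 characters, so this score is 16/18. A threshold on the score would then decide whether two strings count as a match.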
john_madden
Level VI

Re: Fuzzy string match

As a follow-up, I ran across an open-source Python package that provides all kinds of string similarity measures. It seems to be really well done. It's at:

https://github.com/luozhouyang/python-string-similarity

It was straightforward to write a little JSL function that wraps one of the Python functions in this package. I decided to use the Jaro-Winkler algorithm as implemented there. My function looks like this:

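// Wraps the strsimpy Jaro-Winkler similarity so it can be called from JSL.
JaroWinkler = Function( {str1, str2},
	{arg, rslt},	// local variables
	// Build the Python source; Eval Insert replaces ^str1^ and ^str2^ with their values.
	arg = Eval Insert(
		"\[
from strsimpy.jaro_winkler import JaroWinkler;
jarowinkler = JaroWinkler();
rslt = jarowinkler.similarity('^str1^', '^str2^')
]\"
	);
	Python Init();	// start the Python session
	Python Submit( arg );	// run the generated Python code
	rslt = Python Get( rslt );	// copy the Python variable rslt back into JSL
	Python Term();	// close the Python session
	rslt;	// return the similarity score
);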


rslt = JaroWinkler( "My string", "My tsring" );
Show( rslt );  // Log displays the following: rslt = 0.974074074074074;

(I'm using Python 3.8 on Mac. strsimpy depends on numpy, which must be installed in your Python environment.)
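To show what the score above measures, here is a textbook Jaro-Winkler sketch in plain Python with no third-party dependencies. This is not strsimpy's code: the function names and the defaults `prefix_scale=0.1` and `max_prefix=4` are the conventional choices, assumed here for illustration, so results may differ from the package for some inputs.

```python
def jaro(s1: str, s2: str) -> float:
    """Textbook Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters match only if they are equal and at most `window` positions apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Count matched characters that appear in a different order in the two strings.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2

    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str,
                 prefix_scale: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for strings that share a common prefix."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


print(round(jaro_winkler("My string", "My tsring"), 6))  # prints 0.974074
```

Here all nine characters match, one pair ("s" and "t") is transposed, and the shared prefix "My " has length 3, which gives a Jaro score of about 0.963 and a Jaro-Winkler score that agrees with the value logged above.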

 

 

 

john_madden
Level VI

Re: Fuzzy string match

Because the generated Python code wraps each string in single quotes (JSL itself uses double quotes), any single quote inside str1 or str2 must be escaped. Put the following two lines at the beginning of the function:

Substitute Into(str1, "'", "\'");
Substitute Into(str2, "'", "\'"); 
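The reason for the escaping can be seen on the Python side alone. The sketch below pastes a value into a single-quoted Python literal, as the JSL wrapper does, and checks whether the result still compiles. The template and helper names are hypothetical. Escaping backslashes before quotes is an extra precaution assumed here; it is not part of the two JSL lines above.

```python
# Hypothetical stand-in for the code string that the JSL wrapper generates.
TEMPLATE = "rslt = len('{s}')"


def build_source(value: str, escape: bool) -> str:
    """Paste value into a single-quoted Python literal, optionally escaping it."""
    if escape:
        # Backslashes first (assumed extra step), then single quotes.
        value = value.replace("\\", "\\\\").replace("'", "\\'")
    return TEMPLATE.format(s=value)


def is_valid_python(source: str) -> bool:
    """Return True if the generated source compiles without a syntax error."""
    try:
        compile(source, "<generated>", "exec")
        return True
    except SyntaxError:
        return False


print(is_valid_python(build_source("O'Brien", escape=False)))  # prints False
print(is_valid_python(build_source("O'Brien", escape=True)))   # prints True
```

Without escaping, the apostrophe in "O'Brien" ends the Python literal early and the submitted code fails to parse; with escaping, the literal survives intact.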
