Column Formula to convert Unicode to symbol
I want to make a JMP table that has one column with the Unicode shortcut for a symbol and another column in which JMP displays the symbol (with a formula). In the screenshot below, I would like the column Displayed to have a formula that uses the column Unicode as its input and shows the resulting symbol.
The table below was created by manually typing the Alt+Code.
I can't figure out how to do that in a column formula. Any help would be appreciated.
Re: Column Formula to convert Unicode to symbol
Something like
beta = "\!U0392";
theta = "\!U0398";
Or maybe
Open(
"http://www.fileformat.info/info/unicode/block/greek_and_coptic/list.htm",
HTML Table( 1, Column Names( 1 ), Data Starts( 2 ) )
);
which makes a table loaded from the URL.
I like that site for finding characters; build JMP's \!U version from their U+ notation.
http://www.fileformat.info/info/unicode/char/3b2/index.htm
They show U+0392, JMP uses \!U0392
Re: Column Formula to convert Unicode to symbol
Or, you might need something like this
txt = "038F";
code = Hex To Number( txt );
mat = J( 1, 1, code );
blob = Matrix To Blob( mat, "int", 2, "little" );
string = Blob To Char( blob, "utf-16" );
Show( string );
// the log shows: string = "Ώ";
which, if your browser supports it, looks like http://www.fileformat.info/info/unicode/char/038f/index.htm
The JSL above takes the printable hex representation in txt, converts it to a number, makes a 1-element matrix for Matrix To Blob (the matrix could hold more elements if you have more characters), and uses the resulting blob, which holds the character's number in UTF-16 form, to make an actual Unicode character.
Or, you could put the hex into a JSL string with the \!U and parse it.
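For the original question (a formula in the Displayed column that reads the Unicode column), the blob conversion above can be collapsed into a single expression. This is only a sketch: it assumes :Unicode is a character column holding bare hex digits such as 038F. If the column holds something else, such as Alt codes or a U+ prefix, the text would need cleaning first.

```jsl
// Sketch of a column formula for the Displayed column.
// Assumption: :Unicode holds bare hex text such as "038F".
Blob To Char(
	Matrix To Blob(
		J( 1, 1, Hex To Number( :Unicode ) ), // 1x1 matrix holding the code point
		"int", 2, "little"                    // 16-bit little-endian, as in the script above
	),
	"utf-16"
)
```

Because each value is written as one 16-bit unit, this form only covers code points up to FFFF.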
Re: Column Formula to convert Unicode to symbol
Hi @Craige_Hales ,
Thanks for the info on this, I found it quite interesting and helpful.
One thing I don't get, though, is how to force the \!U command with the hex. Let's say that you have a column like the one from the website that you referenced earlier, where the entries are all U+hex, e.g. Beta = U+03D0.
You can split up the text in the column with something like
Word(1,:Character,"+")||Word(2,:Character,"+")
Which would give you the output U03D0.
But, when I try to put that in a Parse() command with the \!U in front, JMP complains at me. I get that Parse("\!U03D0") gives beta. What's challenging me is that I can't concatenate the \!U to the hex code. For example, I would try
Parse("\!U"||Word(1,:Character,"+")||Word(2,:Character,"+"));
JMP complains. What I'd like the stuff inside the Parse command to be is literally "\!U03D0" so that when it runs parse on it, it actually evaluates it and returns beta. I'm just trying to think of how to do this as a column formula rather than a separate JSL script.
Thanks!
DS
+Re: Column Formula to convert Unicode to symbol
Escapes!
hex = "03D0";
uni = "\[\!U]\";
// or
// uni = "\!\!U";
quotationmark = "\!"";
beta = eval(parse( quotationmark || uni || hex || quotationmark ));
Building the string \!U requires escapes so that JMP won't look for hex digits after the U before you have put them there. That lookup happens at parse time, not at run time, so once the string is built (in uni above) it is just normal text, until Parse scans the string again. For that second scan we need a valid snippet of JSL, which will be "\!U03D0", including the quotation marks. To get a quotation mark, more escapes! The string parses, then evaluates to a Unicode character. That character is the result of the Eval, and it is assigned to beta.
The first escape uses \[ ... ]\ to mean "don't interpret anything in between". The second one uses \!\ to make a literal \, after which the ! and U are just regular characters.
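The same steps might be folded into a column formula, which is what the question asked for. This is a sketch under the assumption that :Character holds text such as U+03D0, as in the question; whether Eval( Parse() ) behaves well inside a formula in your JMP version is worth checking on a small table first.

```jsl
// Sketch of a column formula; :Character is assumed to hold text such as "U+03D0".
Eval(
	Parse(
		"\!"" ||                      // opening quotation mark
		"\!\!U" ||                    // literal \!U, built with the \!\ escape
		Word( 2, :Character, "+" ) || // hex digits after the plus sign
		"\!""                         // closing quotation mark
	)
)
```

Only the hex part after the plus sign is taken from the column, so the U in the data is never concatenated; the \!U prefix comes entirely from the escaped literal.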
Re: +Re: Column Formula to convert Unicode to symbol
A wonderful example and explanation of what happens at parse time versus run time.
Stan
Re: +Re: Column Formula to convert Unicode to symbol
Hi @Craige_Hales ,
Thanks for the explanation on how to force the "\!" escape. I had tried different options, but wasn't getting them to work quite right, but that explanation helped to see how things needed to be modified with both the escape and concatenate.
Thanks!
DS
Re: +Re: Column Formula to convert Unicode to symbol
What if I want to list all of these Unicode characters in JSL?
There are 65535 in total.
Do the base-10 numbers need to be converted to base 16 first?
Thanks, Craige!
i=65535;
he=hex(i);
txt = eval(parse( "\!"\!\!U" || he || "\!"" ));
But it doesn't seem right.
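One possible cause, offered as a sketch and not a confirmed diagnosis: as far as I know, Hex() given a plain number returns the 16-digit hex of the floating-point value, not integer digits, while the "integer" option returns an 8-digit integer form. Under that assumption, keeping the last four digits gives the string that \!U expects. The value 946 is only an illustrative choice.

```jsl
// Sketch: build a 4-digit hex string from a base-10 code point,
// then reuse the escape trick from the earlier reply.
i = 946;                                // decimal for hex 03B2
he = Right( Hex( i, "integer" ), 4 );   // assumed 8-digit integer form; keep the last 4
txt = Eval( Parse( "\!"\!\!U" || he || "\!"" ) );
Show( he, txt );
```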
Re: +Re: Column Formula to convert Unicode to symbol
OK, is this it?
Thanks!
Re: +Re: Column Formula to convert Unicode to symbol
Looks right. I'd do it with the matrix-blob conversion, like this:
linelength = 32; // break lines every 32 characters
mat = 0 :: 65535; // useful to start at 0, but null is a problem...
mat[1] = Hex To Number( "2400" ); // fixup null character
mat = Shape( mat, N Cols( mat ) / linelength ); // reshape to concatenate new lines
mat = mat || J( N Rows( mat ), 1, 10 ); // newlines(10 decimal is 0A hex) every linelength
blob = Matrix To Blob( mat, "int", 2, "little" ); // little endian 16 bit matches utf-16
string = Blob To Char( blob, "utf-16" ); // done!
Show( string );
to get the result in the picture (many characters removed for this picture; shown in Notepad with a Noto Sans font).