The following is a case study of using the JSL Debugger to solve a problem. This is based on a real script with the problem specified. If you would like to see the problem in context, I created a script – much simplified from the original – that shows the problem. If you’d like to “play along,” you can download DebuggerDemo-SolvingProblems.jsl from the JMP File Exchange (free SAS profile required for download).
Although I’m describing the original script, the code snippets and the pictures are all from this sample script. Follow along!
The other day, a JSL scripter who had a problem with a script was directed to me for help. Her script ran fine in English, but a tester running her script in Simplified Chinese got errors.
The problem seemed to be with the way strings were being handled in English versus Chinese: Specifically, escaped quotes (\!") were being left out.
In English, here is a sample string that was constructed:
Note the missing \!"...\!" around South, which caused errors when the string was parsed and that unquoted value couldn’t be resolved.
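To make the difference concrete, here is a minimal sketch of the two kinds of string. It is not the string from the original script: the column name Region is a hypothetical stand-in, and only the value South comes from the description above.

```jsl
Names Default To Here( 1 );

// Hypothetical column name (Region); only the value South is from the article.
// \!" is the JSL escape sequence for a double quote inside a string.
quoted = "Region == \!"South\!"";   // the value is a quoted string literal
unquoted = "Region == South";       // the value looks like a variable name

// Print both strings to the log to compare them
Show( quoted, unquoted );
```

When the second string is parsed and evaluated, South is treated as a name to look up rather than as text, which is the kind of unresolved-value error described above.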
Was there a bug in JMP when handling quotes within strings in languages other than English?
She sent me the script and the data needed to run it. The script was just short of 900 lines and produced an interactive report. The first 130 lines were setup, lines 131-886 defined expressions, and then the rest of the script used those expressions.
The first thing I did was run the script in English and then in Chinese, to see if I had the same problem. I did. Then I ran it in French, which confirmed that the problem was with any non-English language, not just Chinese. I ran the rest of my tests in English and French, because French is easier for me to read than Chinese.
TIP: It's easy to switch the language you're running JMP in on Windows. Select File > Preferences, and select the Windows Specific category. At the top, select a language under Display Language. On Macintosh, quit JMP first, and then open System Preferences (on the Apple menu), and select the Language & Text category. Drag a new language to the top of the list. Don't close this window! It sets the language for your computer. Start JMP, and it will run in the first language in the list. When you're finished, quit JMP and drag English (or whatever language you typically use on your computer) to the top. Now it's safe to close the System Preferences window.
There were several lines that constructed the string above, and they all relied on fairly complex code. Executing that code line-by-line would have been hard and time-consuming: I would have had to enter and exit multiple expressions while trying to understand what I was looking at.
Running just the appropriate string-building lines (after the script had been run once to initialize all those variables) gave the same results in all languages, with the escaped quotes being used correctly. However, the lines that formed the string used values from two data tables plus a third that was constructed from the first two, and it was very hard to tell at a glance where all the variables being used were created and populated.
In other words, this was a job for the JSL Debugger!
Enter the JSL Debugger
Running JMP in English, I opened the script and clicked Debug. Since I’d already run the script, all its variables were listed under Globals. And there were a lot of them – too many to track as I moved through the script. (I could also have discovered this by running Show Globals() and watching the log grow very long.)
That was all I needed to know right then, so I quit the JSL Debugger and quit and restarted JMP so that my environment was now completely clean. I opened the script, and started the debugger again.
I was interested only in the string that was being built, so I added that variable to my Watch list. To do so, I selected the variable in the script as shown in the JSL Debugger, right-clicked and selected Add Watch.
I added a breakpoint on the first line after the setup portion was finished (Line 131 in the original; Line 34 in my sample), and another one at the first “real” instruction line after all the expressions were created (Line 887 in the original; this doesn’t exist in my sample). To do so, I clicked next to the line number in the script shown in the debugger.
TIP: A breakpoint will only work if you place it at a line that performs some action. A breakpoint at a blank line, a line that only contains a closed parenthesis, or a line that's commented out will be ignored.
I ran the first “setup” portion and saw that my string variable had not yet been created. I learned that with one click – much faster than I would have running the script manually line by line!
I ran the second portion, which simply created all the expressions.
Now it was time for action.
The “meat” of the script was contained at the bottom of the original script, in only 10 lines. The first part was a loop that called a couple of expressions (which called expressions, which called expressions, etc.). I used the Step Into button to step into each expression, to find out what was happening.
TIP: Using Step Over when an expression, user-defined function, or Include() is called runs that code in its entirety, returns its result, and places the point of execution at the next line. If you don't need to step through line-by-line, this is a fast way of skipping what you don't need. On the other hand, sometimes you do need to see exactly what that expression, function, or included script is doing. Use Step Into instead, and you will be able to step through that piece of code line-by-line.
What I discovered was that each piece of the string was built within an If() expression: If the column with the information was character, then the value was quoted with the escaped quotes; otherwise, the value was not quoted.
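The pattern can be sketched roughly as follows. This is an illustration, not the original code: it assumes the Big Class sample table that ships with JMP, uses Column() instead of the Parse() construction shown later, and the names clause, piece, and value are made up for the example. It deliberately keeps the comparison against "Character" that the original script used.

```jsl
Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

// One character column and one numeric column from the sample table
ColumnNames = {"name", "age"};
clause = {};  // illustrative list that collects the pieces

For( thisCol = 1, thisCol <= N Items( ColumnNames ), thisCol++,
	col = Column( dt, ColumnNames[thisCol] );
	value = col[1];  // value in the first row of this column
	piece = If( (col << Get Data Type) == "Character",
		// character column: wrap the value in escaped quotes
		ColumnNames[thisCol] || " == \!"" || value || "\!"",
		// any other column: insert the value without quotes
		ColumnNames[thisCol] || " == " || Char( value )
	);
	Insert Into( clause, piece );
);

Show( clause );
```

Everything hinges on which branch of the If() runs, so that condition is the thing to watch while stepping through in the debugger.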
This also meant that, out of almost-900 lines of code, I was really only interested in a small chunk of 19 lines (which nevertheless depended on the other 800+). These lines are re-created (and simplified) in Lines 38 – 54 in my sample.
I changed JMP to run in French, quit and restarted, and went through the debugger again.
Suddenly, all was clear...
JMP is Localized! (Who Knew?)
The problem was not with the assignments that built the string. The problem, which I saw immediately with the debugger, was that in French the condition within the If() expression that tested whether the column was character evaluated to false. For character columns, the script executed a different code path in French than it did in English.
In fact, the condition in French would always return 0. The condition piece of the If() expression is this (Lines 40 and 48 in my sample):
Parse("dt:"|| ColumnNames[thisCol])<< Get Data Type()=="Character"
In English, col<<Get Data Type() returns "Character" for a character column. In French, it returns "Caractère", which will never evaluate to being equal to "Character".
col<<Get Data Type() always returns a localized answer.
Since the column in question was never determined to be equal to "Character" when run in Chinese (or French or any language other than English that JMP supports), the string was constructed as if the cell value was numeric. And that was why the string in English had escaped quotes and the string in Chinese didn’t.
Note: Although the JSL Debugger doesn’t show the results of the <<Get Data Type() message – only the result of the comparison – I could easily quit the debugger, and select and run Parse("dt:"|| ColumnNames[thisCol])<< Get Data Type() to see what was actually returned. The debugger allowed me to find the problem quickly and painlessly.
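A quick check along those lines might look like the sketch below. It assumes the Big Class sample table rather than the original data, and it uses Column() in place of the Parse() construction. The variable dataType is introduced here only for illustration.

```jsl
Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
ColumnNames = dt << Get Column Names( "String" );
thisCol = 1;  // the first column of Big Class (name) is a character column

// Capture the message result so both the text and the comparison can be inspected
dataType = Column( dt, ColumnNames[thisCol] ) << Get Data Type;

Show( dataType );                 // the text JMP returns in the current display language
Show( dataType == "Character" );  // the comparison the script relied on
```

Running these lines in each display language shows both the returned text and whether the comparison holds.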
Call to Action
There is no bug in JMP: This is working as designed. The script needs only a small change to work in any language.
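As one possible direction (an assumption for illustration, and not necessarily the approach we took), the check can avoid comparing localized text altogether by testing the type of a value from the column. The sketch assumes the Big Class sample table and a table with at least one row; isCharacter is a made-up name.

```jsl
Names Default To Here( 1 );
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
col = Column( dt, "name" );

// Values read from a character column are strings, whatever the display language,
// so Is String() does not depend on localized text.
// Assumption: the table has at least one row to read from.
isCharacter = Is String( col[1] );

Show( isCharacter );
```

The trade-off is that this inspects a cell value rather than the column's metadata, so it needs a row to read and says nothing about other column types.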
There is more than one approach to make the script friendly across languages. What would you do? I’ll add the approach we took to the comments in a week or two.