Creating a JSL Script Ecosystem: GIT, Unit Test, VBA for PPT, Crash Log Collection and More (2020-US-45MP-614)
Serkay Ölmez, Sr Staff Data Scientist, Seagate Technology
Fred Zellinger, Sr Staff Engineer, Seagate
With many users and multiple developers, it becomes crucial to manage and source control JSL scripts. This talk outlines how to set up an open source system that integrates JSL scripts with GIT for source control and remote access. The system can also monitor the usage of scripts as well as crash log collection for debugging. Other features such as VBA scripting for PPT generation, unit testing, and user customization are also integrated to create high quality JSL scripts for a wide range of user base.
Auto-generated transcript...
Speaker | Transcript |
Serkay Olmez | Hello this is Serkay and Fred from Seagate and today we are going to talk about how we built a JSL ecosystem to manage our JSL scripts. |
So we are using Git to source control our JSL scripts, as well as to distribute them. So that will be the main point of the talk today. | |
But in addition to that, I will be also talking about VBA for PowerPoint integration and also crash log collection, as well as unit tests. I will jump to the outline quickly. | |
So what I want to first talk about is our history with JSL. I've been scripting in JSL for about 10 years or so. | |
And we are very happy with where we are at now, but it took us quite a bit of time to get here and I want to talk about the milestones of our experience, what we did so far and why we did so. | |
And then we'll be talking about Git and how we enabled Git to source control our JSL scripts and how it enabled us to build further features. For example, once you enable Git, | |
you can add more features such as monitoring your scripts. So the developers can know that their scripts are in use, and they can monitor the usage. | |
And probably more importantly, they can also collect logs. If the script crashes, the developer knows, and he or she can go back and fix those bugs. | |
And you can add one more feature, going one step further. You can even create automated bug tickets, since you already know that your script crashed. So you can automatically create a bug ticket for that, and you can track those tickets using tracking software such as Atlassian's JIRA. | |
And I will end up with our best practices and lessons learned so far. | |
So, | |
And in the appendix I also have a manual for a script I will be talking about. It's a script that can push images to PowerPoint. | |
And I have a very detailed manual for that, and I should just let you know that everything I talk about here, the data and the scripts will be available. | |
And they are posted in a public repository in the references section here and you can just go there and grab those files if you choose to do so. So let me start with | |
the brief history here. I've been working with JSL for about 10 years or so, and I started with | |
the very basics, and I didn't really know much about JSL scripting. | |
And then once you start doing that, you realize that you have to have some proper source control and you need proper ways of distributing your scripts. | |
So the thing about JMP scripting is that it has zero barrier to entry. So you can literally do a plot manually and then go grab the script. It's written for you. | |
And you can also use the JMP Community at JMP.com to ask questions and get answers. | |
And the scripting's so efficient that you're doing your job very efficiently, and people will notice. They will ask you how you do things and you'll say I have a script for that. And they will | |
ask, "Can you share that with me?" And all of a sudden you become a developer, although you didn't intend to do so. | |
And then you have to deal with distributing your scripts. The first idea that comes to mind is just attach them, which is a horrible idea, and I've done that for quite a while. | |
And it's kind of obvious why that's not a good idea, because you attach a script and then you send it out and then the next day you make a | |
revision and then you have to send it again, and you don't even know if the user will go with this next one. So I'm just illustrating the point here. | |
Recently, I got an email from a colleague and he was referring to a script I created in 2017 and I was just numbering my scripts with these | |
version numbers, which is not a good idea. So it comes back after three, four years and you realize that people are still using a three-year-old script, because they didn't update. | |
One way of solving this problem is to use a shared drive, and based on the interactions I had with people at the | |
Discovery Summits, many people, many companies are using this one. So what developers do is | |
dump their scripts into a shared drive, and users pull directly from the shared drive, which solves half of the problem, the distribution problem, but it doesn't do anything about source control. You cannot trace the changes you made in the code. | |
And that's why we actually moved to Git. And that was a breakthrough for us and enabled lots of features. So what you do with Git is that developers will push their scripts to a repository and | |
users will be pulling their scripts directly from Git. So that was a big improvement for us and it enabled us to collect crash logs and | |
usage of the scripts, etc. And I just want to talk about a couple more things I learned from the Summits as well. They have been quite useful to improve my scripting skills and I attended a summit in 2018 and I learned quite a bit about expressions, etc. So | |
people may want to go back and listen to those presentations, because they do help with the scripting skills. | |
And one other milestone for us was about the testing. So I was inspired by this talk in the Summit last year, which was about unit testing and that enables you to automatically test your scripts before you publish them. | |
Today I will be mostly talking about integration tests because unit tests are...unit tests are required, but not sufficient. | |
Because you can do unit tests, you can test all of your modules, and they check out fine. But when you put them together, they may not work. They may crash, as illustrated here. Each drawer is tested, probably, but when you put them together, they won't operate. | |
One nice feature that helps the developer quite a bit is about log collection. | |
It is so helpful to know that your script crashed and, you know, how it crashed. And you can do that by collecting the logs from users and you can go back and fix your script and push the changes, and users will have their fixed script right away. | |
And the final feature, which we are rolling out rather recently, is automated bug reporting. Since you already have the crash report, why not act upon that information? And so we created an automated | |
monitoring system, which will track those crash logs and create bug reports automatically so that the developer can work on those. And on top of it, you can add the user as a watcher so that the user will know that somebody is working on the problem. So that | |
closes the information gap between the user and the developer. | |
So this shows the timeline, and I'll be spending some time on the | |
individual items, but we'll start with Git, and I will build this quickly and let Fred talk to this. | |
Fred Zellinger | Okay. |
So, | |
Git is just a version control system, but I don't use the words version control. Actually, I want to say that Git helps you by giving you, effectively, unlimited redo, even if you use Git on your own local computer only. It | |
gives you the ability to track revisions of your files and go back to old ones, in case you decide that some work you did the last few weeks was incorrect and you want to go back to something from several weeks ago. | |
So Git is just software that you install on your local computer. It creates a repository, and that repository can then be posted up to other repositories, | |
such as ones on GitLab or GitHub. TortoiseGit is a GUI for Git on your local machine that makes things easier. So, Serkay, if you could open the next slide. | |
So the model that I've tried to get developers to use starts on their local computer. Have the developers install the Git client and start | |
version controlling the files on their local PC. Once they get in the habit of doing that, then we can connect them to a remote repository, | |
and Git connects to remote repositories over SSH, generally, or by other methods, and they can push | |
copies of their Git database up to the remote repository. Once it's up on the remote repository, then it's available for an HTTP web server to share back out, and JMP | |
can then point to the URLs on the HTTP web server and load scripts from it. So instead of using a shared drive, we're using an HTTP web server that was populated by a push to a Git repository. | |
And then the bullet items down below just point out the benefits of that and how exactly it works. | |
We can go to the next slide. | |
Serkay Olmez | So I will take over here, Fred, and I will show a very basic illustrative implementation, and the scripts are available in the References. |
So, | |
this will be very basic code. And what I'm showing here is the code that the developer is developing. It's the bare minimum, right. It's just a dialog box that says, hello world. | |
And assume the developer wants to pass this code to users. So instead of giving this code, what the developer passes is this code, | |
which is a link to the repository. So this is the link to the hello JSL script, which lives | |
in the remote repository. What the user does is just grab the | |
script from the URL. So this is static code. It doesn't need any change, so the developer can change the script whenever he or she wants, but the user doesn't have to do anything. And I will just show you an illustration here and I just want to go to the full screen. | |
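The launcher pattern described here can be sketched roughly as follows. This is a Python analogue, not the JSL shown in the talk, and `SCRIPT_URL`, `fetch_text`, and `run_remote_script` are hypothetical names chosen for illustration. The point it shows is that the file handed to users contains only a fixed URL, so the developer can change the script behind that URL without the user changing anything.

```python
from urllib.request import urlopen

# Hypothetical address of the script on the web server that is
# populated by a push to the Git repository.
SCRIPT_URL = "https://example.com/scripts/hello.py"


def fetch_text(url):
    """Download the current version of the script as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


def run_remote_script(url, fetch=fetch_text):
    """Fetch the script behind the URL and execute it in a fresh namespace.

    The launcher never changes; only the file behind the URL does.
    The fetch argument can be swapped out, for example in tests.
    """
    source = fetch(url)
    namespace = {"__name__": "__remote__"}
    exec(compile(source, url, "exec"), namespace)
    return namespace
```

A user-side launcher would then be a single call such as `run_remote_script(SCRIPT_URL)`. Because this executes whatever the server returns, it only makes sense against a server the users already trust.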
This just illustrates how this works with a particular GUI, which is GitHub Desktop. So what you do is you install this | |
software on your computer and create a local repository, and it will keep track of the changes you have made. For example, in this case, I just created this hello JSL script, | |
and this software will know that there is a change and it will highlight it automatically. And what you do is, you first commit it to your local repository, | |
which records those changes in your local repo, and then you will be pushing it up to the origin, which is a remote repository. Now I will be pushing it to the remote, which will make it available to the users. | |
So once you do that, the users will be able to pull this new code. And I'm now switching back to the user role here, and this is the script the user runs. Once you do that, the dialog box will show up. So you are running the script that you pulled from the repository. | |
So I will illustrate this a little better with more advanced code, which will also be related to PowerPoint. | |
So people do lots of analysis in JMP, and many people still, at the end of day, want to push their results into PowerPoint. I know JMP has some capabilities to push | |
images directly to PowerPoint, but we wanted a little more...something a little more sophisticated. We want to | |
manage the template of the PowerPoint. We want to do some more stuff within the PowerPoint, decide how many, how many images we want to put per slide, etc. And | |
in order to do that, you first need to connect JMP with PowerPoint, and you can actually do it | |
in a very good way, so you don't even have to leave JMP. So what you can do is locate where your PowerPoint executable file is, and it's typically under Program Files. | |
And then you create a batch file that will trigger this PowerPoint executable, and it will go and grab the PowerPoint file you want to run and it will just run it. So you can do all of this in JMP. Basically what this does is it searches for the PowerPoint executable file. | |
And once it locates it, it bundles it into this batch file. And it also includes the path to the PowerPoint you want to run | |
and the macro, so the PowerPoint will have a macro in it to do the management within the PowerPoint. So this is how you | |
tie JMP to PowerPoint, and you don't even have to leave JMP to do that. But the question is, how do you get this PowerPoint file to your users to begin with? You could | |
ask them to go and download it, which is not ideal because you want to manage this automatically and at the same time you want to source control your PowerPoint as well. So it needs to be a part of the whole ecosystem. You don't want to put it outside. | |
And one other requirement is that you don't want to download the PowerPoint every time the script runs. So you want to download the PowerPoint only if it has changed in the repository, that is, if there is something new with the PowerPoint. | |
So a way of doing this is to use this little JMP command, which is Creation Date. So what I'm doing here is to check | |
the date of the local file. So if the local PowerPoint is older than what I have in the repository, then I will go and download it. So | |
it will go and download the PowerPoint using an HTTP request, and it | |
looks something like this. So what you do is you just go check your PowerPoint on your local computer. If it doesn't exist, you can just go and pull it from the repository using an HTTP request. If it exists, you look at its date, and if it is old, you still go and | |
pull the new one. So the bottom line here is that you can integrate PowerPoint seamlessly into the JMP environment, so you can push your results from JMP into PowerPoint, including tables and images. And you can do all of this without breaking source control, and I will show a demonstration here. | |
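The "download only if missing or old" check can be sketched as below. This is a Python analogue rather than the JSL from the talk, and it makes two assumptions: it compares the local file's modification time instead of its creation date, and the remote timestamp is supplied by the caller (for instance parsed from a response header, if the server provides one). `needs_download` and `sync_file` are hypothetical names.

```python
import os


def needs_download(local_path, remote_timestamp):
    """Return True when the local copy is missing or older than the remote one.

    remote_timestamp is in seconds since the epoch; how it is obtained
    from the server is an assumption left to the caller.
    """
    if not os.path.exists(local_path):
        return True
    return os.path.getmtime(local_path) < remote_timestamp


def sync_file(local_path, remote_timestamp, download):
    """Call download(local_path) only when the local copy is out of date.

    Returns True if a download happened, False if the local copy was kept.
    """
    if needs_download(local_path, remote_timestamp):
        download(local_path)
        return True
    return False
```

Passing the download step in as a function keeps the date logic separate from the HTTP request, so the check can be exercised without a network connection.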
So this will be the code, for example, you would give to your users. Again, this is static code. It really has nothing except for a URL, which links the script to the remote repository. So it will go and grab the | |
JMP script from the repository and those scripts are again available and you can just pull them from the References. | |
From the developer side, this is the actual script, right. This is the script the developer has developed, and | |
he or she pushes it to the remote repository. And I just want to point out a couple of nice features of this script quickly. | |
So, for example, look at this. This is a standalone text script and you don't have to distribute PowerPoint files or additional scripts | |
separately. What you do is you link them in your script and they are linked in the repository here, | |
including the PowerPoint. So this script will manage all the distribution. So, it will go and grab the PowerPoint. | |
It will go and grab other additional files if needed. So everything is bundled in together into the script, so it does all the management for you. So let me show this quickly. I will pull this back | |
and put it into full screen | |
so you can see clearly. So this will be a demonstration of triggering PowerPoint automatically | |
for a particular case. And what this script does is it gets the paths from the table. This is a JMP table that includes lots of image paths, and those are referring to this particular | |
server, and it will take those paths, just the highlighted ones here, and it will push them into PowerPoint. | |
I will just run this, and this script is available to you if you want to give it a try. It's a fully functional, useful script. So I am running this | |
script here. Again, it is referring to the repository. So it doesn't have the raw script; it's just pulling it from the repository. You run it, it retrieves the code and runs a dialog. | |
And I will simply run this. I won't go into the details, and once you run it, it will just trigger the PowerPoint automatically. | |
And PowerPoint will launch and then it will...it's starting...it's building the slides here. It will pull 20 images and this will take 10 or 20 seconds and then you will have the PowerPoint | |
done. So this is the PowerPoint you get and everything is done automatically and everything was pulled from the repository. | |
So what is next? How would you revise this PowerPoint? I assume you want to make some changes to your template. And | |
you can see more details about this in the appendix, but what I want to show is how you change the | |
script in the PowerPoint. And I'll go to the full screen, | |
and I will just run this. What we are starting from is what we were left with in the previous slide, right, so you had these images. | |
What I want to do is to change the template in a trivial way, just to illustrate the point. So you go to... | |
you go to the macro, you scroll down and find the thing you want to change, and I will be doing a simple change here. I will be changing the header color just to make a point, right. So you change this | |
and | |
save it. And this is going in real time. | |
You save this, and you can delete the slides so that it goes faster to the repository, because you will be pushing this to the repository. | |
And then you push this out using the Git GUI. It will go into the repository | |
and it will make it available to your users. So all of your users will get the modification instantly. So you don't have to ask them to go and update the PowerPoint or anything. | |
This is just going to the repository. | |
And I will switch back to JMP and in the user mode and then run the script again, which will retrieve the new PowerPoint. | |
And if you run it again, it will go through the | |
slides, it will create the slides. And what you will notice is that you do have the change, which was about changing the color of the header line. | |
So, | |
what else do we have? | |
Script monitoring. I think this was one of the best features we developed, because this gives the developer the ability to see whether | |
his or her script is appreciated or not. Right. So is it run? Is somebody running it on a regular basis? | |
So that's one benefit. The other benefit is about | |
crash log collection. So if the script fails, you can capture the failure by using the Log Capture functionality of JMP. So you sandwich your subroutines in Log Capture. | |
If it fails, the log of the failure will be returned into this returned log text. And then what you also do is you | |
enclose it in a Try, so that the script still survives to do the reporting. | |
And then you check whether it's empty. If it's empty, that means there was no crash at all, so your script survived. | |
But if it's not empty, that means there was a crash and you contained it. You grab the log and now you can report it. And the way to do it is to use an HTTP request with the | |
user ID, the script name, and the log note, whether it was a crash | |
or just a regular run, and maybe some performance metrics. So the bottom line is you can transmit some metadata about your script back to the developer. | |
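The capture-and-report flow just described can be sketched roughly as follows. This is a Python analogue, not the JSL from the talk: JMP's Log Capture and Try are approximated by a try/except that returns the traceback text, and `capture_crash`, `build_report_url`, the query parameter names, and the server address are all hypothetical.

```python
import traceback
from urllib.parse import urlencode


def capture_crash(subroutine):
    """Run the subroutine and return its failure log.

    An empty string means the subroutine finished without crashing;
    otherwise the traceback text is returned and the caller survives.
    """
    try:
        subroutine()
    except Exception:
        return traceback.format_exc()
    return ""


def build_report_url(base_url, user_id, script_name, log_text):
    """Encode run metadata as query parameters for a logging server.

    The parameter names are illustrative; a real server defines its own.
    """
    lognote = "crash" if log_text else "ok"
    query = urlencode({
        "user": user_id,
        "script": script_name,
        "lognote": lognote,
        "log": log_text,
    })
    return f"{base_url}?{query}"
```

The same report is sent for normal runs and for crashes, which is what lets one mechanism serve both usage monitoring and crash collection. For long logs, sending the text in a request body rather than the query string may be the more practical choice.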
And what the server will do is, is to log them and you will have a set of files stored in the server and you can monitor them. And I'm just showing a sample | |
table out of these. It's a crash log, which has the date stamp and the script name, so this script has failed on this date | |
with this particular crash. So this is extremely useful for the developer because he or she can go back and fix the issue. So I will collapse this, and I will do a quick demonstration here that shows how you | |
contain the crash and how you report it to the user. So what I will do here is to make a subroutine crash. I will create an undefined parameter | |
and JMP will complain. It will say this thing is not defined, so the script is crashing. But what I will do is I will call that subroutine | |
in Log Capture, and everything will be sandwiched under Try. So although the script crashed in the subroutine, it will survive overall, and it will be able to send a | |
crash report. So you give a nicely formatted notification to the user and say that the script crashed with this particular error, and we created a log for it, and we are working on it. So that's the notification you give to the user. | |
So one key thing I learned in the Summits was about automated testing. I used to do my testing manually, which is very frustrating, | |
because it takes time and you cannot capture each and every corner of your script. It's impossible to do hours of testing when you make a small change. | |
So I started getting into automated testing, and it's very important to do unit testing. Unit testing refers to the testing of individual subroutines. So you have a function and you want to test it | |
multiple times before you put it into your overall system, so you can do these unit tests | |
for individual modules, but it won't be enough. And I can show you a couple of examples of that. | |
For example, NASA lost their Mars Climate Orbiter because of a very strange error. There were two software teams, | |
and one of them was working in units of pounds and the other | |
was using newtons. So they certainly did their unit tests. But when they put their code together, it didn't work, because one was expecting newtons and the other one was sending pounds. So they literally lost the | |
Orbiter, just because they forgot to convert from pounds to newtons. And more recently, Boeing's Starliner test flight failed to reach the space station, and then they admitted that they could have caught the error if they had done | |
a rigorous integration test. And at the end of day, what counts is the | |
integration test because modules don't live alone in the script, they talk together. | |
And this is a funny illustration of the problem here. So you can have two objects which are tested thoroughly. It's a window, right, what can go wrong with a window? | |
But if you have two of them put together, they won't even open, so you have to do some integration tests to see if they work okay together when they are combined. | |
And I want to illustrate a quick point here that shows how I started doing automated testing. And this will refer to a particular case, which is | |
a difficult case, to be fair, because this will be testing of a modal window, and modal windows won't go away unless you click on a button. So you have to | |
create a JMP script that clicks on a button, that kind of mimics human behavior, and what I'm doing here is to inject a tester into the | |
modal window here. And the nice thing about this approach, I think, is that you can still distribute | |
this script to your users and the users won't even realize that there's actually some testing routine in it. | |
So it has the hooks for a tester, but when you run this alone, what it does is it looks for this test mode parameter, and it's not set. | |
So if it's not set, the script will set it to zero, and that will disable all the hooks in the script. So you can give this to your users and they can run it as is. | |
However, if you want to test your script, what you do is you build a tester code on top of it. So the tester code will set the test mode to one and it will also create the | |
tester object you want to inject into your script. For example, in this particular case, what it does is, it's selecting a particular column | |
and then it's assigning it into a role by clicking a button and then it's running the script, right. So that's literally mimicking human behavior. | |
So then it will load the actual script. So it's basically injecting those parameters into the script that you want to test. Now you're running the script automatically, so it will load the script and run it, and it will close the UI after clicking this button and that button. | |
And then what you do is you run it again in Log Capture, so that if something goes wrong, you will be capturing the failure. | |
And you also put it under Try so that overall, the script survives to report the log. And then if nothing was wrong, you will have an empty log returned, and that means your script did not fail. | |
If something has gone wrong, you will know it in the log capture. So that's the idea. | |
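The test-mode hook idea can be sketched as follows. This is a Python analogue with a stand-in `Dialog` class, not a JMP modal window; `Dialog`, `run_script`, `click_through`, and the column names are hypothetical. It shows a script that behaves normally by default and only lets an injected tester drive the window when the test mode is switched on.

```python
class Dialog:
    """Stand-in for a modal window with a column list, a role button, and OK."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.selected = None
        self.y_role = None
        self.closed = False

    def select_column(self, name):
        if name not in self.columns:
            raise ValueError(f"unknown column: {name}")
        self.selected = name

    def click_assign_y(self):
        self.y_role = self.selected

    def click_ok(self):
        if self.y_role is None:
            raise RuntimeError("no column assigned to the Y role")
        self.closed = True


def run_script(columns, test_mode=0, tester=None):
    """The script under test. Users call it with the defaults."""
    dialog = Dialog(columns)
    if test_mode == 1 and tester is not None:
        tester(dialog)  # hook: the injected tester mimics the user's clicks
    # With test_mode == 0 a real modal window would wait here for the
    # user; that part is left out of this sketch.
    return dialog


def click_through(dialog):
    """Tester: select a column, assign it to a role, then press OK."""
    dialog.select_column("height")
    dialog.click_assign_y()
    dialog.click_ok()
```

A tester script would call `run_script(["height", "weight"], test_mode=1, tester=click_through)` and then inspect the returned dialog, while the copy given to users never sets the test mode and so never touches the hook.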
So, | |
We have | |
the ability to monitor the scripts. We have the ability to | |
capture logs of crashes. So we thought, why don't we act on the crash log? If you see a crash log, that means something has gone wrong. The user already knows that, because the script crashed, obviously, and the developer knows that the script crashed because the logs are | |
stored in a server. So they are there for you to act on. But the thing is, this is not a closed loop yet, because the user doesn't know that the developer knows. And you can close that loop by creating automated tickets. So since you have the information already, you can | |
use bug tracking software such as Atlassian's JIRA, and then you can use the REST functionality | |
to collect the information from the server, create a ticket, assign it to the developer, and you can also assign the user as a watcher. The watcher means that whenever the developer does anything | |
about the bug and enters that information into JIRA, that will be looped back to the user, so he or she will be notified and will know that somebody is working on the bug. | |
And there are multiple ways of doing it, depending on the flavor of JIRA you have, and I am just giving you the | |
basic code here. It's a curl command that you can create, and you collect the metadata that you have from the crash logs and you embed that into a JSON file, and then it will be transmitted through REST and it will go into | |
JIRA, which will track it for you. And then JIRA, if you set it up properly, will notify the developer and possibly | |
the user for whom the script crashed. So this is all tied together, and the developer will know there's a bug that he or she needs to work on and the user will know that somebody will be working on the bug. | |
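As a rough sketch of the ticket-creation step, the snippet below builds the JSON body that would be sent to the tracker. It is written in Python rather than as the curl command from the slide, it only builds the payload and does not send it, and `build_issue_payload` and all argument values are hypothetical. The field layout follows the general shape of JIRA's REST issue-creation request, but field names and user identifiers differ between JIRA flavors, so treat the details as assumptions to check against your own instance.

```python
import json


def build_issue_payload(project_key, script_name, user_id, crash_log, assignee):
    """Build a JSON body for creating a bug ticket from one crash log entry."""
    fields = {
        "project": {"key": project_key},
        "summary": f"{script_name} crashed for {user_id}",
        "description": crash_log,
        "issuetype": {"name": "Bug"},
        # Assumption: the developer is identified by a user name; some
        # JIRA flavors expect a different identifier here.
        "assignee": {"name": assignee},
    }
    return json.dumps({"fields": fields})
```

Adding the user as a watcher is typically a separate request made after the ticket exists, which is why it is not part of this payload.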
Okay. I think this takes me to my closing notes. Just the takeaways you can get out of this presentation. | |
Git has been the cornerstone of our system and it has enabled us to do lots of nice features, and I didn't even mention the basic ones, | |
which are kind of obvious, because it gives you the ability to collaborate as well. So if you have multiple people working on the same script, you can use all the functionalities of the | |