
The Morning Update: Creating an Automated Daily Report to Viewers Using Internet-Based Data (2021-EU-45MP-728)

Brian Corcoran, JMP Director of Research and Development, SAS

 

JMP Live is a powerful new collaboration tool. But it is only as useful as the quality of the content that you provide to it. This talk discusses the development of a JMP JSL script to acquire data through the internet via a REST API. It then will show how to publish an initial report to JMP Live and automatically update the data within that same report on a daily basis. In this fashion you can provide automated reporting to your viewers who just want to see the latest data when they start work in the morning.

 

 

Auto-generated transcript...

 

Speaker

Transcript

Brian Corcoran: Welcome to The Morning Update. This is my talk for JMP Discovery Europe 2021.
My name is Brian Corcoran and I am a JMP development manager.
So what are we hoping to do today?
I would like to show you how to create a report in JMP based on an internet-based data provider using a REST protocol.
Once we do that, I'm going to introduce you to how we can publish this report to JMP Live using the updated JMP scripting engine that we've put into JMP 16.
Finally, I'm going to show you how you can automate this task, so that every day, when you come into work, the report has already been updated for you and you can just view it with your morning coffee or tea.
Okay, so first let's talk about internet data providers.
Most of them are based on something called a REST protocol.
It's a stateless call; essentially it looks like a URL with some parameters tagged on to the end. An increasing number of organizations are using it to expose their public data to end users. Some examples are the World Bank, the US Census, and Google.
So JMP has a facility to help you with this called HTTP Request. It will allow you to access these services. Typically they use something called a GET or POST verb, and HTTP Request will allow you to use those.
For this particular report I'm going to use the Johns Hopkins COVID REST API with some data from the pandemic.
So Johns Hopkins is a university in the United States that aggregates this data from all over the world, and then provides this free public API to access it.
Now there is also a premium version of this, and that gives you better access and more granularity with the data, but we're going to try to get by with the free version, for now, and hopefully you can take some of the scripts that I give you and try them out yourself.
So what does the REST API look like? Well, let's take a look.
Here's an example from Johns Hopkins.
It starts out with this base URL, which in this case is api.covid19api.com. And here, you can kind of look at the URL and say, hey, we're asking for the total confirmed cases for a country.
Where you see this bracketed country, you have to actually insert the name. In my case, I'm going to use Germany (they use the English names), but there are probably 100 countries where you could try this out.
And Johns Hopkins requires you to supply a starting and ending date after this base URL, where you see the question mark. Those are essentially parameters to the API call you're making, and they allow us to return values within the date range that we specify.
And it's kind of this long format: the year, month, and day, T for time, and then the 24-hour time with a Z appended to it.
So fortunately JMP has facilities to help you with that. You can use a Format call: Today() will give you the exact time for right now, along with the date, we specify the format string that we want to use, and then we can just append a Z to the end to get the format we need for Johns Hopkins.
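The timestamp format and the parameterized URL described above can be sketched in Python (an illustration only, not the JSL used in the talk). The URL path below and the helper names `api_timestamp` and `build_url` are assumptions made for this sketch; check the provider's documentation for the exact route.

```python
from datetime import datetime, timezone
from urllib.parse import quote, urlencode

# Assumed route for "total confirmed cases for a country"; verify before use.
BASE_URL = "https://api.covid19api.com/total/country"


def api_timestamp(moment: datetime) -> str:
    """Format as year-month-day, 'T', 24-hour time, with a trailing 'Z'."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S") + "Z"


def build_url(country: str, start: datetime, end: datetime) -> str:
    """Insert the country name and append the from/to date parameters."""
    query = urlencode(
        {"from": api_timestamp(start), "to": api_timestamp(end)},
        safe=":",  # keep the colons in the time readable
    )
    return f"{BASE_URL}/{quote(country)}/status/confirmed?{query}"
```

Calling `build_url("germany", start, end)` with timezone-aware datetimes yields a single URL string that can be handed to any HTTP client.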
So, like I mentioned, REST calls typically use either a GET or POST verb. Johns Hopkins uses a GET; I'm going to jump out of PowerPoint for a minute to show that.
If you go to that website that I had in the slide, or also in the paper, you'll see that it lists the APIs by type and shows you essentially how you pass the information and what you can expect to get back.
Alright, so you can kind of go through here and see what each one requires, and you can see there are premium categories, where you have to pay so much per month for access.
Okay.
So.
What do you get back? Well, you get back JSON, which is just a bunch of strings in name-value pairs (for instance cases, colon, and then a numeric string) that's going to represent the number of cases for this particular observation, alright. And JMP has nice facilities to access JSON, and we'll show you that in a minute.
Here is our actual HTTP Request call. We're just going to pass in our URL, the method we want, which is GET, and then this Secure(0). Why do we do that?
Well, Johns Hopkins is a public API and it does not want to use Secure Sockets Layer, or SSL, so we need to turn that off or the call will fail.
And then we just make our call with the Send command, and our JSON will be returned in the data.
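For readers who want to see the same request-and-send idea outside JMP, here is a minimal Python sketch using only the standard library. It is not the JSL HTTP Request call from the talk; `make_request` and `fetch_json` are hypothetical helper names, and the sketch does not reproduce the SSL setting discussed above.

```python
import json
from urllib.request import Request, urlopen


def make_request(url: str) -> Request:
    """Build a GET request that asks for a JSON response."""
    return Request(url, method="GET", headers={"Accept": "application/json"})


def fetch_json(url: str, timeout: float = 30.0):
    """Send the request and decode the JSON body that comes back."""
    with urlopen(make_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending makes it easy to print the request for debugging before anything goes over the network.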
So let's drill down a little bit into a script. I'm going to get out of PowerPoint and we're going to bring up JMP.
I'm using JMP Pro 16, but this will work with regular JMP as well.
And I'll mention that this script is included with the conference materials, along with the paper that we'll be looking at.
Okay, so.
What am I doing here to set up? First of all, I'm going to say I'm using the documents folder for where I'm going to store my data, and I'm going to store it in a table named covid19_de.jmp.
Later on I'm going to generate a report, and I'm going to make sure it always has this name. It's very important, in this case, that it's a standardized name, and I'll show you why later.
Finally, I'm just combining my documents path and my file name into a full path to use.
All right, we're not going to worry about this date formatting function here, and we're going to go into the meat of how we acquire our data.
Right, so we're going to use a pattern here, and that is, we're going to assume that we've never run the script before. Now, if we do not find a file where we have previously accumulated data, then we will create a data table and fill it in with values. However, if we do find a data table, then we will just update the table with the latest day's worth of data, just one value, all right.
And that way we don't have to worry about whether we've run this before or not. We can just run this script kind of blindly, and you can give it to somebody else and it'll work for them.
Alright, so here we're going to say if our file exists, our data table in the documents folder, just go ahead and open it and, by the way, go ahead and set this flag to say we have data already.
Otherwise I'm going to create a data table, and I'm going to create it with date, cases, and daily change columns.
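The open-if-it-exists, otherwise-create pattern can be sketched in Python as follows. This is an analogy, not the JSL from the talk: it uses a CSV file in place of a JMP data table, and the function names and column labels are assumptions for illustration.

```python
import csv
from pathlib import Path

# Assumed column labels mirroring the date, cases, and daily change columns.
COLUMNS = ["Date", "Cases", "Daily Change"]


def open_or_create(path: Path):
    """Return (rows, have_data); have_data is True only if the file existed."""
    if path.exists():
        with path.open(newline="") as handle:
            return list(csv.DictReader(handle)), True
    return [], False


def save(path: Path, rows) -> None:
    """Write all rows back out with a header line."""
    with path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(rows)
```

Because the flag comes from the file system rather than from the user, the same script can be run on a fresh machine or on one that has months of accumulated data.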
Right.
Now the next part is we're going to format our strings for the call to Johns Hopkins. Remember we needed to have a from and to range. Alright, so here's the string we already looked at. This is for the today value.
Alright, so in the case where I only need the current day's value, I'm just updating my data.
I'm going to go from yesterday to today, essentially. Alright, so I'm going to create "yesterday" by saying today minus two days, and I'm going to go to today.
I do two days because, depending on the time and when the data gets updated at Johns Hopkins, sometimes you get one value and sometimes you get two.
If I get two, I will just take the most recent value, but I want to make sure I get something.
Alright, the next thing is if I've never gotten data, I want to have a start date. Now I'm arbitrarily going to start on September 1 of 2020; you could put whatever you wanted. Now here you see I'm actually using August 31. That's because Johns Hopkins does not actually give us a value for the change between days; they will only give you the total cumulative cases for pandemic data. So in order to calculate the change, I have to subtract yesterday's value from today's value.
Well, if I want to start on September 1, then I need the August 31 data in order to compute the change for September 1, so that's why we do that.
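The reason for fetching one extra day can be seen in a small Python sketch (illustrative only; `build_rows` is a hypothetical name and the case counts in the usage note are made up). Each daily change is the current cumulative value minus the previous one, so the first observation serves only as a baseline and produces no row of its own.

```python
def build_rows(observations):
    """observations: (date, cumulative_cases) pairs, oldest first.

    The first pair is the baseline day (for example August 31); rows start
    with the second pair (for example September 1).
    """
    rows = []
    for (_, previous), (date, cases) in zip(observations, observations[1:]):
        rows.append(
            {"Date": date, "Cases": cases, "Daily Change": cases - previous}
        )
    return rows
```

With made-up inputs `[("2020-08-31", 100), ("2020-09-01", 130), ("2020-09-02", 175)]`, this produces two rows whose daily changes are 30 and 45.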
If you pay for the premium API at Johns Hopkins, you can get the change value.
Alright, so here's our URL that we discussed earlier.
Alright, so this is important here. If we have data, then our URL is just going to start from yesterday. If we don't (and we're using this If statement), then we're going to start from September 1, our start date.
And then we're just going to go to today.
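The choice of date range can be sketched in Python like this (an illustration, not the script from the talk; `request_window` and `most_recent` are hypothetical names, and the "Date" key is an assumed field name). When data already exists, the window reaches back two days so that at least one value comes back; otherwise it starts one day before the first day to be reported.

```python
from datetime import datetime, timedelta, timezone

# First day that should appear in the table, as chosen in the talk.
FIRST_REPORT_DAY = datetime(2020, 9, 1, tzinfo=timezone.utc)


def request_window(have_data: bool, now: datetime):
    """Return the (start, end) range to request from the data provider."""
    if have_data:
        start = now - timedelta(days=2)  # may return one or two values
    else:
        start = FIRST_REPORT_DAY - timedelta(days=1)  # baseline day
    return start, now


def most_recent(records):
    """Pick the latest record; ISO-style date strings sort chronologically."""
    return max(records, key=lambda record: record["Date"])
```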
I show this URL just for debugging purposes, but then this is where we actually do our request and send call.
Right.
I put in a little wait to make sure that it has a chance to run.
And here is where we get our JSON data back, and this is where JMP has a really handy facility.
Parse JSON will take this big block of strings and break it into an array of name-value pairs. You can then call N Items on that array to find out how many pieces of data you've gotten, and you can reference that data as an array with array subscripts.
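The same parse, count, and subscript steps look like this in Python (a sketch for comparison with the JSL Parse JSON approach, not a substitute for it). The field names "Date" and "Cases" and the helper name `parse_observations` are assumptions for illustration.

```python
import json


def parse_observations(text: str):
    """Turn a JSON array of objects into (date, cases) pairs."""
    records = json.loads(text)      # list of dicts, one per observation
    pairs = []
    for index in range(len(records)):   # len() plays the role of an item count
        record = records[index]         # access by subscript
        pairs.append((record["Date"], int(record["Cases"])))
    return pairs
```

The `int(...)` conversion is there because, as noted in the talk, values may arrive as numeric strings.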
Okay.
So now let's navigate down a little bit.
Here we're going to fill in our data table. If we already have data, then we're just going to add one row to the table at the end, and we're going to fill in that data value, along with our change, which we compute from today's value versus yesterday's value, all right.
Then we just save it off.
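The update path, adding a single row whose change is computed against the last stored value, can be sketched in Python as follows (illustrative only; `append_latest` and the column labels are assumptions, and the sketch expects at least one existing row).

```python
def append_latest(rows, date, cases):
    """Append one row; the change is today's total minus the last stored total."""
    previous_cases = int(rows[-1]["Cases"])
    rows.append(
        {"Date": date, "Cases": cases, "Daily Change": cases - previous_cases}
    )
    return rows
```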
Now.
If we've never created it before, then we're going to add the number of rows we have, minus one because there's header information, and then we're going to cycle through this and calculate all our daily changes and date values and put the case data in the table. I think I'm going to demonstrate, hopefully, running this from scratch; right now we're at like 163 days or something like that.
And then we will save out that table to the documents folder.
Okay, now that we have the data, we can think about publishing to JMP Live.
But this is probably a good chance for me to describe how you do JSL programming for JMP Live and how we've changed it in JMP Live 16 and JMP 16.
Let me bring up the paper associated with this talk and you'll have this in the Community as well.
Alright, so in JMP 16, we rewrote the scripting to be, hopefully, more powerful but easier to use, and the scripting revolves around the idea of having a connection or managed connection information stored away.
So.
Let's bring up JMP again.
I'll show you what that means. You go to File, Publish, Manage Connections.
All right.
And we can add one.
Here, you would specify a connection name of your choice, and then the URL where your JMP Live site is (JMP Live is essentially a REST service itself). If your administrator requires a secret API key to enable scripting, you would need to supply it here.
When you do this and you hit the Next button, you're going to be prompted for credentials, most likely, depending on your authentication mechanism.
When you enter those, it will essentially give you an access token and store it away on disk for you: not your credentials, just this access token that allows you to access this site and script to it.
That way, without having to provide any of this information in the script, you can just reference the connection name that you supplied. And I'll show you one, for instance. This is JMP Live Daily, which is what I'm going to use. Here's my URL endpoint and my API key.
I can just reference JMP Live Daily, and it will know how to connect within my script.
Okay.
So, to create the connection then, I just say New JMP Live and the name of my connection. Now here I'm saying, let's prompt if we need to. What does that mean? Well, if for some reason your credentials expire or your access token is old, then it will prompt you to enter your credentials once the script starts. If you don't supply this and your credentials have expired, then the script will just fail.
Okay. So how do we actually publish a report to JMP Live? Alright, so here's an example of just a simple bivariate that you might run out of Big Class, all right.
So, to make that a published report, you're going to just say, create a new web report and assign it to a variable.
And then I'm going to take this bivariate reference and I'm going to say, add that report to the web report.
And I'm going to optionally provide a title and a description.
And then I'm just going to call Publish, and the publish will return a result.
Okay, and what we publish up to JMP Live might look something like that.
If we want, the result will tell us if we succeeded, and since the publication is actually like an HTTP call, we can look at the status, if we so desire, or an error message. Okay.
All right.
The other interesting thing is, if our operation is something like adding a report or a folder to JMP Live, the result we get back contains information that allows us to further manipulate that item.
We can call As Scriptable to use that information to create an object within scripting, like a report, that we can then access. For instance, after I've done this, I can use the report and set my report title. It will go up to JMP Live and change the report title.
Okay.
The whole idea within JMP Live scripting now is around manipulating reports and folders, searching folders, searching for reports, things like that.
The report understands that it has an identifier. And this identifier, if you were to look at it, is a long alphanumeric string that really would be awkward to enter into a script or remember.
But it's required for you to uniquely identify that report. And the reason you might want to uniquely identify it: let's suppose you want to delete it.
You tell JMP Live that you want to delete the report, and you can use Get ID on that report object to supply that unique ID so JMP Live knows which one to remove.
I'm showing you these particular items because they'll be important in our ultimate script that we hope to produce.
The other area where it can be really important to know that ID is in searching.
Now here, I can find reports. For instance, let's suppose I want to find all the bivariate reports that start with Biv. I can just ask JMP Live to find reports and return a list of results. Then I can actually turn that list into a list of reports.
There's a function called Get Number of Items on that report list that allows me to cycle through each one by subscript.
And then, if I so desire, I could delete all of them, alright. So the search capability is new, and we hope fairly powerful, for you to use to do large operations on a JMP Live site.
Right, so there's one operation that we need to address too, before we're really ready to show our script off a little bit further.
And that is Update Data.
So in JMP Live 16, we've added the capability to update just the data for a report without having to publish the entire report back up to JMP Live.
Let's suppose you get a report just the way you want it, and you know, maybe it's a little customized and you like the appearance and you don't want to mess with it.
But you do want to update the data and have it recalculated. Well, now you can pass just the data table for that report up to JMP Live, which also reduces the transmission time, and you do that by calling Update Data, providing the report ID, and then just the data table with the updated data.
The report will recalculate on JMP Live, rather than having to do it on your desktop,
and anybody who happens to be viewing that report will also see the update.
Okay, so now we kind of have all the tools that we need to actually do our script, so let's go take a look.
All right.
So,
here's where I create my JMP Live connection.
And now I'm going to create a control chart. Now a control chart really is not the ideal analysis platform for this data, so why am I using it? Well, two reasons. One, it is nice to see day-to-day changes, and two, it allows me to plug or advertise the fact that we have another new feature in JMP Live 16, and that is to show control chart warnings.
If you publish control charts and there are observations that are out of bounds that would generate a warning on the desktop, well, when you publish it to JMP Live or update the data, JMP Live can also generate warnings to send to anybody who's subscribed to that report and wants to get an email or a notification within the website that something is out of bounds. This can be really useful for things like process control.
So my colleagues are doing a talk on control chart warnings, and I encourage you to also check that out if you have a chance.
How did I generate this? Well, I just went to JMP with some older data. There's a facility within JMP, if you're doing an analysis, where you can just say Save Script to Script Window.
I just took that information and plugged it into this script, so that's pretty handy.
Okay.
So let's get into the meat of how we're going to publish a report, and I promise you, we will run this in a little while.
Okay, so.
Once again I'm going to have a pattern here. I'm going to look for the report to see if it is already up on JMP Live.
If it is not, I will publish it, but if it is already up there, then I will just take the updated data table that we created earlier and provide that to update the data on the server side and have it recalculate the report if it sees fit.
All right, so how do I do that? First of all, I'm going to search for our report (and remember we use a standardized report name that I had specified earlier so we're always looking for the same one).
And I'm also going to say only published by me, just in case somebody else had a report of the same name. I wouldn't want to get that.
All right, I'll turn it into a list that I can look at. And if the number of reports is zero, that means I didn't find a previously published report.
So I'm going to create a new web report, add my Control Chart Builder output to it, and publish it. And I want to make sure that it's available for everybody by saying Public(1).
If I did find one, I'm going to take that report, referencing the first item returned, and I'm just going to update the data here, using the updated table that we generated previously.
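The search, then publish-or-update decision can be sketched in Python with an in-memory stand-in. To be clear, `FakeLiveSite` and its methods are hypothetical and exist only for this illustration; they are not the JMP Live scripting API, which is driven from JSL as described in the talk.

```python
from dataclasses import dataclass, field


@dataclass
class FakeLiveSite:
    """Hypothetical in-memory report server used only to show the control flow."""
    reports: dict = field(default_factory=dict)  # report id -> report details
    next_id: int = 1

    def find_reports(self, title, author):
        """Return ids of reports with this exact title published by this author."""
        return [
            report_id
            for report_id, report in self.reports.items()
            if report["title"] == title and report["author"] == author
        ]

    def publish(self, title, author, data, public=True):
        report_id = f"r{self.next_id}"
        self.next_id += 1
        self.reports[report_id] = {
            "title": title, "author": author, "data": data, "public": public,
        }
        return report_id

    def update_data(self, report_id, data):
        self.reports[report_id]["data"] = data


def publish_or_update(site, title, author, data):
    """Publish on the first run; afterwards send only the new data."""
    matches = site.find_reports(title, author)
    if len(matches) == 0:
        return "published", site.publish(title, author, data, public=True)
    site.update_data(matches[0], data)  # first match, as in the talk
    return "updated", matches[0]
```

The standardized report title matters here: the search only finds the earlier publication if every run uses exactly the same name.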
The rest of this is just debugging information that I showed in the log just to see if everything went alright, but it's not really necessary.
Finally, at the end here, I have a Quit statement.
When we actually go to automate this later,
this is important because we want JMP to shut down and close down all the windows. Otherwise, the next time we go to run it, it might take a look and see JMP's already running and think that things are hung from a previous operation.
However, for interactive operation, I'm going to comment this out right now.
Okay.
So I think we're ready to go here. We can give this a try and we'll hope for the best. Sometimes Johns Hopkins gets very busy and will actually reject the request to get the data, which would be unfortunate, but let's try this out.
And just to show you, in the documents folder at this point, I do not have a JMP table with the name that I'm specifying, and if I go to the JMP Live site that I hope to publish to, we don't see any output from Control Chart Builder there.
Alright, so let's give this a try.
Right, there's our control chart.
I'm going to refresh JMP Live.
Okay, there's our control chart builder output. This one did have warnings to it.
If we look at this within JMP Live,
we can hover over points and see what the most recent data is. This is for February 10; I'm on the 11th so we have up-to-date data. This is some data that is considered out of control, based on the moving average from back in January and late December.
Right, so far, so good.
If I open up my documents folder and refresh that, we can see that our JMP table has been created.
All right.
So, and we see now this has been published a minute ago.
Alright, so let's go ahead and shut this down.
And we're going to...actually, I didn't want to do that...hold on a second.
We're going to cheat a little here.
Let's go ahead and delete the last value that we got, right. Then we're going to save that and close it down, alright. So we're going to simulate the fact that we have not run it today yet, all right, and then we're going to run this again.
Okay, just fetch the last value.
And we go up to our website.
And we see it just regenerated a few seconds ago again. So in this case, we just updated the data and we just got the last value.
If I were to bring up my mail, I happen to be subscribed for warnings, and hopefully we might see a little update here too. We're getting notifications that there is a publication of this Control Chart Builder report and there were warnings, and if I want, I can go and see where those failed and what points are out of bounds.
Okay.
Alright, so I think we're in good shape for trying the automated task. So I'm going to go ahead and I'm going to delete this post.
Right.
Let's shut this down.
Shut this down.
I am going to put our Quit back in there, because now we're going to need that for when we run in an automated fashion.
And I will close this.
Go to the documents folder, and I'm going to delete our data and pretend that we're running this from scratch.
Right.
And let's make sure JMP is shut down.
Okay.
Now.
If you've seen some of my previous Discovery talks, you may have seen me use the task scheduler before. It's a popular topic with me.
You just type in Task Scheduler here on Windows; I hope you saw that.
On the Mac, you would have to use Automator or a cron job. I would suggest Automator.
All right, but the task scheduler allows you to run just about anything on a regular basis.
So let's go ahead and
create a new task. We'll just create task here and we'll name it
COVID Data for Germany.
I want to run with high privileges.
I'm going to run only when the user is logged on, because I don't want to enter credential data, but I would suggest selecting "Run whether user is logged on or not" if you're doing this for a production purpose, because if your machine gets rebooted due to a Windows update or some other reason, you want it to still run, and this will allow you to do that. It will require you to enter your credentials when you finally save out this task.
Alright.
So for triggers, what is that? That's when I want it to run, so let's go ahead and do that.
And let's say we want to run it daily starting tomorrow.
And maybe I just want to run it at six o'clock a.m., before I get in in the morning, whatever "get in" means anymore.
Before I roll out of bed and go to work. All right.
So I'm going to stop this task if it runs longer than 30 minutes because that probably means it's hung.
And otherwise I think we're good to go there.
So what action do we want to perform? Well, we want to run JMP, so you have to navigate to where jmp.exe is installed, which is in Program Files, SAS, and either JMP or JMP Pro 16. Go ahead and select that.
And then our argument is our JSL script, which, unfortunately, you have to enter manually here, which I'll do.
But just make sure that you're careful with that.
Okay.
Now, under Settings, I'm going to make sure that we allow the task to be run on demand, because that'll allow us to try it right now and make sure it works right.
And if there's already one running, make sure to stop it, because that probably means it's hung. And stop the task if it runs longer than an hour, again, just in case it hangs.
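The talk sets the task up through the Task Scheduler window. As a rough command-line counterpart, the Windows `schtasks` tool can create a similar daily task; the Python sketch below only builds the argument list and does not run it. The function name and any paths passed to it are hypothetical, and the sketch covers only the daily trigger, start time, action, and highest-privileges options, not the time-limit or on-demand settings.

```python
def schtasks_create_args(task_name, jmp_exe, script_path, start_time="06:00"):
    """Build a 'schtasks /Create' argument list for a daily task.

    jmp_exe and script_path are supplied by the caller; both are quoted
    inside the action string because install paths often contain spaces.
    """
    action = f'"{jmp_exe}" "{script_path}"'
    return [
        "schtasks", "/Create",
        "/TN", task_name,      # task name
        "/TR", action,         # program to run plus its script argument
        "/SC", "DAILY",        # trigger: every day
        "/ST", start_time,     # start time in 24-hour HH:mm
        "/RL", "HIGHEST",      # run with highest privileges
    ]
```

Printing the list before executing it with a process launcher is a cheap way to double-check the manually entered script path.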
Alright, so there's our task to run every day. So we can debug it essentially by trying it out right now, since we allowed it to be run at any time. Let's go ahead, right click the mouse button and say run.
Hopefully we'll see the taskbar. JMP will briefly come up, run and then go away.
Looking down here, hopefully, things are happening.
Okay, and then it's gone.
Let's take a look at our website; we'll refresh that.
And there is our report generated a few seconds ago.
If we look at our folder, we can see that our JMP table's been generated and hopefully tomorrow morning at 6 a.m.,
our task will run and get us a fresh batch of data and an updated report. And when we come in with our coffee or tea, we can take a look at that and make our decisions for the day.
So that concludes my talk. I hope one of the three aspects we've discussed today (internet-based data acquisition, JMP Live scripting, or automated task generation) has helped you with your job.
Thank you for attending and I hope you enjoy the rest of the conference. Bye.