2023-EU-30MP-1194 - Automatic Data Refresh in JMP Live 17.mp4
My name is Brian Corcoran, and welcome to Automatic Refresh of Data in JMP Live 17. I'm a JMP Development Manager, and my group is responsible for JMP Live.
What is JMP Live? For those of you who may not know , it may be worth giving you a little introduction. JMP Live is a web- based collaboration site, so users can publish reports from JMP, their desktop version of JMP, and the data to JMP Live. Users at JMP Live can interact with those reports, and if they have JMP, they can also download them to work on those reports with their desktop copy of JMP. A copy of JMP is not necessary, though, to use the JMP Live site.
Now, in JMP Live 15 and 16, we required users to publish new content from the desktop application in order to update things. A common request that we got is that it'd be nice if the server could do this for me when I'm not working or whatever, and I could just look at an updated copy at my leisure.
That forced us to revisit how we treat data in JMP Live 17. JMP Live 17 really represents a major rewrite of the product. We made data an equal to a report. Before that, it was along for the ride hidden, and you really wouldn't see it being transmitted up to JMP Live.
Now you can publish data independently. You can look at it on the JMP Live site. You can update just the data in any reports that use it, and they can share that data will all automatically be recreated with that new data. This work was all done to provide the foundation for refreshable data.
Here, the contents of the data post on JMP Live are refreshed on the server side, and there's no intervention by a user in the JMP desktop client. Usually, data of this nature is in a database or some REST- based web endpoint or something like that. It has to be data that is accessible from the server where you have JMP Live installed.
JMP Live provides us, we hope, easy to use scheduler, so you can put in a repeatable, hands- free refresher of your data. This fulfills the dream of, I go home at night and the data is refreshed, the reports are regenerated. When I come in the morning and I'm drinking my tea or coffee or whatever, I can look at the updated report and make decisions based on that new data.
I'm going to provide a variety of scenarios on how you can learn to do data you fresh with JMP Live. Let me first shut down PowerPoint, and I'll bring up a copy of JMP Pro 17, but this will work on regular JMP equally well.
All right, so first , I'm going to start with a really simple explanation of how we separate reports and data in JMP Live 17. I think it's important to understand that before we proceed to the more complicated examples.
I'm just going to bring up a sample data set that we ship with JMP, the financial data. It's just data for 500 companies across a variety of industries, and you can see them organized by type. It's basically a sales profitability data number of employees, things like that.
Let's suppose that I create a simple bivariate plot of sales by number of employees . There's not a lot to it, but I can hover over the points and see the data. Let's suppose I want to publish that to JMP Live.
I'm wanting to go here, and I'm going to say publish report to JMP Live. I've set up a connection to a server that's internal to SaaS and that has access to SaaS resources here. I'm going to publish this as a new report.
I've set up a folder called Discovery to publish this report into. We'll just go ahead and do that. The first time it makes the connection, it can take a little bit longer, but there we go. Let's go ahead and bring up this JMP Live release. I'm going to go to my space that I just published to, Brian Corcoran, in my Discovery folder.
There is my report. If I open it up, you can see I can hover over it just like I did in JMP. But I'm looking at this, and it's a boring report. There's not a lot to it. Maybe I should have included some more information.
Well, let's go back to our report. There we go. Let's suppose I want to add a fit mean and a fit line to that. I also want to supply a local data filter. I'm going to make a filter by industry type so that I can cycle through each industry and look at the individual companies involved. If I get the drug pharmaceutical companies, I can hover over the one with the most sales, and we see it's 9.8 billion.
Now, let's suppose that I want to update this report. But in the meantime, maybe I've got information that says, "Hey, this is not a $9 billion company. It's a $19 billion company, but I'm waiting for verification on that. I don't want to publish the data with this, but I really would like to update the contents of my graphic. Well, we can still do that.
I'm going to go ahead and publish the report. But this time, we're going to do a replace operation. We'll select replace an existing report down here. It's going to ask us, what report do you want to replace? It's going to give us the candidates like the most recently accessed, and so there's our financial report . I'm going to say next there.
Here it says, "What do you want to do about the data? " I'm going to select, Use the data that's already up on JMP Live. I'll say replace the report. It goes ahead and does that.
If I go up to my site, I can see that it did indeed add my fit lines and means and my data filter. I can manipulate these like we do in JMP, but you'll see that my outlier company still is a $9 billion company. All right, so we did not update the data.
Now, let's suppose I've shut this report down, but I do get information that the sales are indeed $19 billion for drug company number one. I can choose to publish just the data. I'll say, Update existing data. Once again, I'll select my financial post, and I want to replace that.
Now you see that it's automatically reloading the data . It's going to recalculate all of our statistics down here as well with new fit lines and that. Now we can see that our outlier is represented here as a $ 19 billion company. If I wanted to, I could even bring up the data with the data viewer that I mentioned earlier , allowing us to explore that, and there is our update to our drug data for company number one.
All right, so that is the separation of reports and data, and that provides the foundation for our data refresh. Let's go ahead and get into a real refresh example. Let's minimize this for a minute, and do a little cleanup because otherwise, we will get confused where we're at.
Now, my next example will be a simple data refresh, and it allows me to introduce another feature that's new to JMP 17, and that is access to the OSI Pi Historian database. If you're not familiar with historian databases, they're often used to collect lots of process data, maybe from manufacturing lines, lots of different machines and devices, putting real -time data into a database. Then you can look at this historian database at your leisure to analyze trends and see where there are problems.
We have a historian database with sample data here at SaaS, and I'm going to select that and get some information out of that. Here is our connection to the PI server. I'll open the sample data area.
What we have here is a simulated data center where we have lots and lots of racks of computer servers. Essentially, we're looking at power consumption on all of those to see where we're spending a lot of money and things like that. All of these represent a table, and we can import all of these at once if we wanted to.
I'm just going to import the data for power meter number one here. This is old data. I'm going to go back in time on it. I'm going to ask for 5,000 points. I'll take just a second or two to import, but we'll start that up. There's our data.
I would call your attention right here to this source script. This is important. If we edit this, we'll see that it contains information to recreate this data fetch, including the location of our P I server, the actual table here that we want to import, how many points we want. This will be useful.
Let's go ahead, though, and just create a simple run chart and control chart. I can select that, and we're just going to do the values. All right, so there it is. Let's go ahead and publish that to our JMP Live server.
All right, I'm going to publish that as a new report back into our Discovery folder. I'm just going to publish it as is. Close that, and we'll bring up our browser. We are going to go to a different session. Hold on. We are going to look at my Brian Corcoran space again.
In the Discovery folder, we'll find our Atlanta data center. Now, we can open in that up and see the points and all that. We know how that works.
I'm going to call your attention to these Files tab, though. Here we have our financial report. But here's our Atlanta data center report. This is the report, but this is the data table, so let's click on this.
There's the report that's based on the data if there are multiple reports that I'll show here. Here's the settings, and here's where it gets interesting. There is our source script that we had down in our table in JMP.
Here's something called a refresh script. Let's just concentrate on these two panes now. The source script has been uploaded with the data, and it provides us a basis for how we could recreate this data. A refresh script is a piece of JSL that is going to supply data when a report records it.
There's one big rule for a refresh script, and that is that the last operation that it performs must be to return a data table. Essentially, data refreshes are done through JSL scripts. Let's enable this refreshable button. Let's copy this script as the basis for our refresh script. I'm just going to paste this in here.
If you remember earlier when we were looking at the P I dialog, I said you could import all kinds of tables at once . Because of that, OSI PI import JSL returns a list of data tables, not a single data table, but a list. Our rule for a refresh script is it must return a single data table. We need to get to that point.
I'm going to assign this output of this refresh script to a list variable that I just arbitrarily named DT list. I'm going to put a semi colon on this run statement. Now I am going to assign the first element in that list, which is the only table that we have and the only table we care about to a variable named DT. Since that's the last operation in the script, that's what's going to be returned here.
While we're at it, though, why don't we go ahead and we're going to change it to return 10,000 points. I'll save that script out. Now, let's go ahead and try this out. Here's a button we can manually refresh our data server side. I'll say yes, we know what we're doing. We want to do this.
All right, I said it was done three minutes ago, and then it changes to a few seconds ago. Let's look at our report. Here's our report. Let's look at it. It looks a little different than the one we had back in JMP because now we have 10,000 points of data. We've done all of this on the JMP Live server, not on the desktop client.
Now we could recreate that on the server , if we want, without ever having to involve our client. Let's go ahead . That's our first example of data refresh. I'm going to clean this up, so we don't get confused with our other work.
A common operation where you'd want to do a data refresh is a fetch of data from a database. One of the big ones is Postgres, and I'm going to show an example of fetching from Postgres. Before I do that, I'm going to go ahead I'm going to change the JMP Live server that I'm accessing. This one is actually outside of SaaS resources. I can go to manage connections here.
You can actually look at this yourself. It's devlive 17.j mp.c om. If you go out there, you'll be able to see these reports. For this one, I'm going to bring up Query Builder. On Amazon Web Services, I have created a Postgres database full of sample data. I'm going to open that up.
I have some stock quotes for Apple computers that I just put up there for demonstration purposes. Let's go ahead and open that up. I'm going to just build a query quickly. I only have date and quote data.
This is essentially the closing quote of the stock price for the end of the day, and it starts on the January 1st of last year. Let's go ahead and just run that query. You see what I'm doing here. Let's take this a little bit further. I'm going to shut this down.
Now, another new feature of JMP 17 is something called the Workflow Builder. I'd like to integrate the Workflow Builder into this demonstration tool. Let's do that. I'm going to go over here , and I'm going to say New workflow.
It's going to capture my actions as I do them. I'm going to start recording that workflow. Let's go ahead and we're going to run our query again, this time with the workflow recording it.
It'll capture that query. You see here, if we look at our source script again, once again, this has information on how to connect to that Postgres database, the table that we want to look at here, and what the query actually is.
Now, let's suppose after the fact that... I don't want that. Let's see. Let's suppose after the fact that we decide we want to do some manipulation on this table. Maybe I want to do a subset of the table based on a where clause.
Now, I probably could have done this in Query Builder, but I thought of this after the fact. Let's do row selection. I'm going to select where, and I'm going to select where the date is greater than or equal to June 1st of last year. I'm going to make sure that it's selected. There it is.
Then I'm going to do a subset of that table just using the selected rows. Then maybe after that fact, and you'll notice that our Workflow Builder seems to be accumulating this information, I'm going to go , and I'm going to color by column . I'm going to color by this start quote.
This is the data to the point that I feel like I want to do my analysis. Let's stop the Workflow Builder from recording that. Let's go ahead and create a graphic on that. I'm just going to do the quotes by the dates. We see when the stock was high back here in the summer of 2022, and it's gone down considerably.
Let's go ahead and publish that. Again, this is going to a new server this time. The first connection can take a little bit of time sometimes.
All right, let's publish new. This time, I've set up a space called Discovery Year of 2023 with a folder named Automatic Data refresh. You can look at this at your leisure and see this yourself. I'm going to name this Apple Stock Quotes Since June. I'm going to go ahead and publish that.
Let's go ahead and look at this server. All right, and this one's in dark mode. Make you realize you're on something different. Here's our report. Let's go ahead, though, and look at that space, and I can search on Discovery to see which one I want to look at. Here's Europe.
There's my folder, Automatic Data Refresh in JMP Live 17 in our report. L et's go ahead and look at the files, though. There's our Apple quotes, and there's no source script. What's up with that? Well, let's go back and look at our data here.
We look at the source script here . We see that subset operation just picked out individual rows that were selected. I t couldn't go far enough back to understand that this came from a previous database fetch. It just knows it has this table that had been unsaved at that time, and it was picking rows out of it. That's not going to be helpful for us.
We have our workflow and I've stopped that. Let's go ahead and say, what happens if I say, save script to script window of our workflow? There it has captured our query along with our subset operation and our coloring operation. Let's go ahead and we're going to copy this and use this as a basis for our refresh script.
All right, so let's make this refreshable. We'll go ahead and edit. All right. This does require a little bit of change. First of all, I'm going to return my information to data tables. This is my query data table, the original full query.
I don't really need to, but I'm going to assign that to a variable here because it provides clarity for me. Here's our where clause selection . This is what I really want to capture this subset operation.
I'm going to say this subset, and I'm going to put brackets around here. The reason is this. The way Workflow Builder builds this is it cascades or chains together operations, and we have a select where here. It's going to take the most recent operation and assign it to my variable.
I don't want the selection to go into this subset, I want the subset operation. I put a bracket around this part to make it just one object that's referred to by subset. It'll be the subset operation operation that goes into this variable.
I'm going to go ahead and put that subset table in here for coloring, and then I'm going to put in an empty reference at the end here to our subset table. A ll that does is ensure that the last operation is to return that subset table.
Let's go ahead and save this. Let's see if it works. Always good to test. It looks like it did. If it didn't, we could go to the history here, and we can see we did an on-demand data refresh. If it had failed, it would indicate that here, a nd the details pane, which just shows that we got a table back out of this, would instead show an error log from JMP itself.
There's a hidden JMP session behind here doing the query for us, and it would provide the log of that JMP session here, so we could get an idea of what's going on. It looks like we have a valid refresh script at this point. However, I just essentially manually refresh data that I already had. That's not particularly interesting.
Let's also look at this table. Notice that for the query information, it put in a holder for the password because we don't want to transmit those credentials. Most of the time, it also put in one for a user ID. In this particular case, it did not because Workflow Builder doesn't do that. But if you did this directly from Query Builder, it would.
How do we provide the password, though, for this script that's going to need credentials to really fetch new data later on? We validated that this script is going to be okay, but we need the password for new data. What we do here is we're going to use this as sign credentials, and we have this credential that I have created, but I'll show you what you would do if you had none. You create just essentially a username and password pair and a stored credential name.
In this particular case, I already had one , and it did get used, but I'm going to select this radio button anyway to make sure that we understand these two are associated. If we had more than one, we would need to have one selected for it to work.
W hat's going to happen is my username and password, when it finds PWD or UID, if there's this placeholder with percent in, it's going to substitute those credentials in at the time of the query, so o nly when needed and make the query itself. These credentials are stored using secure string technology in the database, which also can be encrypted. They're very secure, and they're only used in memory at the time of the query. We're pretty sure that we're not going to have our credentials breached.
Now, what do we do as far as creating a repeatable operation where we don't have to be around to do this? Well, we use the refresh schedule and hit Create. This is pretty flexible but hopefully easy to use panel.
Right now, it's saying it's going to create a schedule that's going to run on every day. I don't want it to run on Sundays and Saturdays. I'm going to exclude those because the stock market is not open. You can have a repeat on a timely basis, like up to 5 and every 5 minutes. I only want it to repeat once a day, so I'm going to turn that off. When do we want it to start?
If all your servers are operating in your same time zone, then you don't have to worry about this. You would just put in whatever time you're at, well, put in the time you want it to run. I have to have a little more complex calculation i n my case. I'm running on Amazon Web Services, and we run all of our servers on the UTC universal time, which at this point is 5 hours different.
Because I want to show this operating quickly for a demo, I'm going to essentially take the time of my demo and add 5 hours to it and we're going to run that. We're going to put this in as running at 7: 31 PM. When I say okay and save this, it's going to calculate when it's going to run, and it says it's going to run in about a minute. That's what we are hoping for.
Off screen here, I have an update to a database. Let's just pretend that this was done from some operation that's automatic, but I'm going to actually provide an additional stock quote that shows the stock jumping up in price. Maybe I'd come in in the morning, take a look at my new graphics, see the stock went up, and then maybe it's time to sell. We're just waiting on this at this point and hopefully, you'll see that it gets queued up and our report will get regenerated quickly.
While we're waiting on this, I will mention, too, that your refresh schedule can be scheduled to terminate at a certain time. If you want it to end at the end of the year or something like that, you can put that in as well. We saw it refreshed a few seconds ago. Let's go take a look at our report. There's our report. We see it's declining price, and then if we hover, we see that we have a new point here, and it has indeed jumped up in price.
That same script, if we go back here, will run 5 days a week, every day at the same time without our intervention . We've provided our credentials. Everything is automatic at this point , and we realized our ambition to essentially just be able to come in and get a new report every morning without having to worry about anything else. You set it up once, run many times, and we're good.
That is probably the most important example I'm going to show. I'm going to show one more trick, and it's more of a trick that may help you in certain situations. Let's clean this up.
What I'm going to do next is I am going to show using a REST endpoint. You're not familiar with that. Essentially, a lot of organizations make their data available through what looks like a web URL. I have a script that I developed, and I'll describe that in a second.
Often you would need a credential or some thing called an API key or something like that to essentially have permission to use this site. Many of them cost quite a bit of money. This one does not.
You access this URL, and it returns a whole block of data. JMP has a variety of methods that it provides to help you parse this data apart and put it into a data table. That's what we're going to do here. This particular site is one called Eurostats. It has a free API I was able to use.
I'm not going to go into it, but it essentially has this query language that you can append to your URL to tell exactly which table you want and what data points . I have it starting in 2021 and not ending, s o we'll continue to get new data as it becomes available.
It returns in this one big block of data, and JMP knows how to interpret that and turn it into a data table here. The data for the date information comes in as character, and we don't want that. We want numeric, and we want it to be continuous, so there's a little loop here that runs to change the column types after the import.
If I run this , and what we're fetching, by the way, are natural gas prices in the EU. This data is pretty old. I don't know how often they refresh their data, but right now , they're only providing data for the first half of 2022. Hopefully, they'll update at least with the second half early soon.
What do I do? I can look at this column data, and the columns are named specifically for periods of time. I really just want to have a periodic refresh of the data where I just grab the latest one and use it in a report.
Let's look at this. If I were to do a graph builder, this is really handy, though, you can drop this in. The geographic and time period column can help you map out the map of Europe. If I drop in the latest one, unfortunately, there's some missing pieces, but it will give us a general idea of what gas prices averaged in this first half of 2022.
Again, if I were to save this out, you'll see that it's going to use this date specific column. That's not what we desire. A little trick you can use here is to just rename the latest column. Here I have another little scriptlet, and I'll open that up.
Essentially, all I'm going to do here is I'm going to say, let's take the last column and we're going to just rename it most recent. Then if I want, I can create a graph builder script that uses most recent. If I do this, there's our column. Now I can bring up Graph Builder and plop in our geography column, our most recent data. I can maybe enlarge that a little, and we can go ahead and publish that.
Let's go ahead and we're going to just put that in our folder under the name Gas Prices in the EU. I'll publish that. We'll refresh our web page here. There's our gas prices graphic, just like we hoped, and it'll have, if we hover over it, the mean gas price and things like that.
If I look at this data, though, and we go into our Refresh settings, it didn't understand all of our script about getting our REST endpoint and things like that, but that doesn't matter because we already have our script. Over here, I can just copy this as is and put it in here. We will return that table as our last operation.
Maybe I just want to run this one day a week or something like that just so that whenever we finally get one, we periodically review this, and we'll see the latest thing. This is a case where you're just occasionally viewing the data to see if there's any updates.
Of course, we'd want to make sure manually that this works before we move on to other things, but we can see that indeed it did the report, even though it has been updated a few seconds ago . Even though the data is the same, we know it's working so that when new data does come out, we will grab it and populate this graphic with it.
A gain, this is on devlive 17.j mp. com. You will need a SaaS profile ID to log into the server if you want to look at this, but I will leave it out there for you to take a look. That it concludes our data refresh examples. I hope this gives you an idea of some of the powerful new capabilities that JMP Live provides. I appreciate you attending this talk. Thank you.
Transcribed with Happy Scribe