Configuring, Connecting, and Collaborating with Data Connectors in JMP 18

Getting data into JMP is a core part of the analytic workflow, so JMP 18 introduces new data connectors to broaden the range of third-party data sources that JMP can access. JMP 18 allows more detailed configuration of ODBC data sources and makes it easier to share those configurations with colleagues across platforms and publish to JMP Live. We discuss data connectors, not only as they exist in JMP 18, but also plans for extending them in future versions of JMP.

Good morning or happy whatever time of day it may be as you watch this. My name is Neal Siekierski. I'm a software developer here at JMP. I'm one of the main developers on our Data Connector initiative here. I'm here today to talk about that data connector initiative, what it means for JMP 18 and what it will mean going forward as well.

We have these three Cs here. This is what we want to really achieve for Data Connect. We want to make it easy to connect to more data sources than before. We want to make it possible to also share those connections more easily. Both with your coworkers who may be on the same platform as you, Mac or Windows or different platform, to share those with JMP Live. While we're giving you access to more data sources, we also want to make it possible to do all of that in one common interface.

With JMP 18. We couldn't do all of this, so instead what we did was focus on ODBC. ODBC is a technology that JMP uses to talk to remote data sources, databases like SQL server or MySQL and many others. We picked ODBC because as it is something that already accesses multiple kinds of databases, there was room to make it possible to access even more.

There are some hurdles that currently make that a little difficult. That's another reason we looked at that. For example, one of the sources of difficulties we see is in SQL code. SQL code is a programming language that's used to get data from databases, and JMP has to create that code, send it through the ODBC API in order to get results back and show you your data.

Not all databases have exactly identical SQL code. For example, one database Postgres, we're getting a couple columns from a table, in this case our favorite big class. It would look like this. But the same query for MySQL and other database will look like this. You can see Postgres. We have like double quotes, MySQL, we have backticks and SQL server's different again.

Because of this right JMP needs to know what database it's talking to. We can get that information. But in order to be able to actually use it, JMP needs to know about each type of database. If JMP doesn't know about that type of database, we have a particular fallback behavior. If it works, then that's great. And we try to make it a reasonable set of defaults, but it might not work. In that case, you can't really use Query Builder with your data source.

Another source of difficulty is the Connection strings that are used. In order to connect to a database. JMP has to give ODBC a string that tells it a variety of information about where it's connecting to, how to connect, and a very simple connection string would look like this. We have, for instance, the server that we're connecting to, JMP up to is a test machine that we have access to here at JMP. We'll be seeing more of that.

But more importantly, we also have a driver. A driver is a particular piece of software that plugs into ODBC to give you access to a particular data source. ODBC itself doesn't really know about data sources, and just like pieces of software you have, you can't install them on Mac and Windows. Or if you do, it's, you know, two different downloads. It's two different pieces of software. It's very similar for ODBC.

Actual SQL server is a driver for SQL server that's created by a company called Actual, and it's for Macs. If I had a connection that used this connection string and wanted to share this with somebody else, but they were on Windows, or they were using JMP Live, then they would have to change it.

To do that, they'd need to know what the Windows driver is. In this case, it's called SQL server. I can run this on my Windows machine and it works fine. But again, you have to know what that database is. You have to know its corresponding driver on each platform, and that makes sharing difficult.

Between, the differences between databases requiring JMPs, knowledge about them, and the different connection differences and connection strings. It can be difficult to share connections, and it can be difficult to connect all the data sources you might want to.

In order to deal with this, we have the data connector framework that we've introduced internally in JMP 18, and we also have a specific application in ODBC.

The core ideas of those are that we've taken all of the options that JMP sets internally based on the connection type. SQL server is one set of options. Postgres needs another set of options. We've exposed all of those and made them all individually configurable.

That way if you have a database that's like SQL server in one way, but maybe different in another, you can change an option and that will work. Those options aren't just things in the preferences' menu. These are configurations that can be named, and then we can share them.

You can get a whole set of configuration. You can give it a name. Then whenever you want to reference it to connect to a particular database, you can if you want to connect somewhere else, you'd use another one. Because they're named, it also makes it possible, more possible to share those connections.

Finally, those settings that you do make, they can be platform dependent. We just saw how drivers might need to be different or other settings.

If it needs to be one thing on Mac and nothing on Windows, it's possible to do that. Or if you have something more fine-grained, it might even be possible to do that. Those are the core ideas. Now I want to talk about them more, but I won't do that by showing everything in action.

Here's JMP. If we go to file database we see there's a new menu item Data connectors. If we click that we see a window here which if you've ever used Query Builder looks pretty familiar. If we go to New Connection we see a list of configurations.

There's a few things to note here. All these ones here at the top, these are built in configurations, so they all start with com.jmp. They're supplied by us. They correspond to the various databases that we supported internally and now have also exposed in this way.

These aren't maybe directly useful for connecting. They don't have information like a server or anything, but they are still useful as the basis for creating new configurations. Below this we see a bunch of other configurations.

Some of these I created myself. Most of them, however, I did not. JMP created them automatically from the connections that ODBC manages, which are called Dsns domain source names. We can see here a list of the connections, the Dsns that I have on my Mac, and we see a number of these back in JMP as well.

These are created automatically so that it's easier to get started. Let me close this. Since we have these on my created connections, you know, ideally the sort of easiest thing that we would be able to do is that we could go and just click connect.

This doesn't actually work. What we see is the error message from the driver. It says we didn't supply a user ID. That's information that JMP wasn't able to collect when it was converting this configuration.

We are however offered the opportunity here. The dialog is suggesting that we create a new configuration, so that we can sort of take this one. Add what we need in this case a username and password. Then we can try to connect. So let's hit yes.

This is the connector editor. We'll talk about it in more detail later. But for the moment let's just add the information we need. We need a password. There wasn't a password before. Now we're going to have a password. I'm going to fill that in. We need a username as well. So same idea.

Now that we've filled that in, we can try to test the connection. We see that it has succeeded. Now that we have something that works before we go ahead and connect, we're going to save a copy. This will save us the trouble of doing this again later. The config here is that we've configured the connection, so I've saved it now okay. To connect.

We see a list of schemas here. If I select one of these schemas we see a list of tables. We're just going to go ahead and pick the one true table big class. Now the easiest thing we can do is just go and open the table, so if we click open table we see we have our table come up. Great.

But we can also use Query Builder. Let me close this, and now we can click the Query Builder button instead. In Query Builder comes up, we'll see again, same interface as before.

We can build our query. I can keep stepping through this. Add some columns, run the query, and then once the query finishes we see our table again. Now since we've our table we've seen that all work. Let's say we're doing this again right.

We've saved a connection, so we should be able to use that. Let's disconnect this. Now we can go. We can find the one we saved. We can connect, and so we've connected now, and we see everything again.

Let's go ahead and open up our table again. That we've opened up the table we might want to do something else with it. I talked about sharing, sharing both with your coworkers and sharing with JMP Live. Let's start with that second one and share to JMP Live.

I'm going to publish the data table. This has come up, so we have our table. Let's click next. We're just going to publish it to my personal space, and now it's been published, let's close that and let's go over to JMP Live.

Here's my personal space I'm going to refresh. We now see our table, so pull it up. Wait for it to load, and now if you go to settings we have the option to refresh your data with the script. This is where one of the differences from Query Builder can be seen. Before we would need a connection string to set up a refresh script. Now if we go ahead and use the source script, we'll see that we see something a little different.

I'm not going to look at this too much right now. I just want to see if it will work, so I'll save it. I will assign some credentials, and then I will go ahead and. I will click refresh. We'll see if it works. It failed. But let's see why. We get an error here. This is a little hard to understand, but what's getting at is that we didn't specify a valid driver.

Let's go look at our script, a little more detailed to figure out what's going on. If we look we see a line here driver actual open source databases. We are specifying a driver. But you see the actual again it's one of those Mac only drivers, so it's no surprise it doesn't work.

Now again before, what we would have to do is figure out what the name of the Windows driver is, substitute that in here, and then we could try again. But in this case I don't have to figure out what it is. I'm going to delete it. I'm going to save it, and we're going to try it again. Looks like it succeeded, so I'll reload it.

There's not really anything different to see here. The data didn't change, but we really saw that refreshed and now. The reason that worked is that. Again, let's go look at our script again. We are using this built in configuration. We've added these additional configuration items so that it matches the configuration that was created by JMP.

Generally speaking whenever we publish JMP live we'll set up the source script so that we've created something that is equivalent to the configuration that you specified.

The reason for the difference. The reason we don't just say ID JMP dev to pkg-config is that JMP live doesn't know about that particular name. To make it more compatible with JMP Live, we're creating again an equivalent configuration, starting with one of the built-in ones that we know is on JMP Live and adding everything else to that.

This built-in one makes things work. Like I said, one of the core ideas of data connectors is that we can have platform-dependent configuration. We see that with this built-in configuration, the built-in Postgres configuration, where it will automatically look for a appropriate driver on platform where it's running.

On my Mac it would find that actual open source databases driver on Windows, it would find the Postgres driver for windows. All of the built-in ones act like that.

This also makes it. This also increases the ease of sharing between Mac and Windows, and between machines that might have different sets of drivers installed. That's JMP live. Let's go back to JMP and we can close this out. Let's disconnect this, so we're on a clean slate. Now that's configuration. That was created by JMP.

Right now what if we wanted to create a configuration ourselves. We didn't want to go through the ODBC manager ask JMP to pick it up and then show it here. Let's look at that at the same time. Talk about things in more detail. This time let's connect to SQL server. That's this configuration here.

And if I click new what that's going to do is let me create a new configuration that starts with this one, so I don't have to specify every little detail about SQL server. I just need to specify the stuff that's unique to my connection. As you bring this up you see the editor again. And now let's look at this a little more detail.

First up we have the type of the connection and JMP 18. We only have ODBC. Then we have the base of the connection, so as I said we're starting with this built-in SQL server configuration. We see here that we have SQL server our built-in SQL server configuration. And specifically we're identifying what we want with that. There's other options. We can start completely from scratch just with the defaults for ODBC in general.

We could specify path to a file. We could even specify a configuration to use as the base in line. But we expect generally speaking you're just going to be using another one, so that's going to be this ID here. Then below that we see all the options, so these are grouped, so this first group is connection information.

Where you're connecting to what your credentials are. The stuff that lives in the connection string for ODBC. Below this are other options, so we see a variety of options for SQL, and we have options just for the underlying ODBC layer as well.

If I go to Connection here, we actually want to make a connection. I'm going to go and fill in the information we need. Database name is SQBTest. You can see that when I did that this checkbox here became unchecked. This is indicating whether we're using the value that we got from this base configuration.

Since I've changed it or not the checkbox is unchecked. I can recheck it, and it will revert back to the base value. But we do need that there. I'm going to fill it in again.

We see the driver here for sharing until JMP 19. I do need to do this, so we'll do that real quick. Password value there. Our server, it's going to be jmpdev2 again. We have our username. Let's see if that connects. It does. But before we go any further, let's consider a scenario here. Say that I'm not actually connecting to SQL Server. Say it's something very similar to SQL Server but not quite identical.

The way it differs is that if you remember SQL Server uses the square brackets. Say it uses double quotes instead like Postgres does. If we go down into all these SQL options, we see a lot of different stuff here. But let's go and look at the quote type. See, it's currently specified as brackets. If I change that to double quotes, then you know we'll have match what our database needs.

Conveniently for this demo, SQL Server actually supports both of them. Square brackets are its native thing, but the double quotes are more the standard. Because it supports both, we'll go ahead and see this in action. Let's save this. It's going to be my SQL Server. Let's connect.

Before we can see tables that we have, we can pick out a class, for instance. If we go through Query Builder. Look for that to come up, we will see that the SQL code it generates reflects our choice.

Normally, again, you would expect the square quotes the square brackets. But now we have the double quotes. We can also see that this works. If I go ahead and run the query, we get our table. Again, I have created a configuration not from scratch necessarily, but starting from the base ones. I've only had to provide a few options, and now I have a fully functioning connection.

I was even able to tweak some of the behavior to be a little different. Now I've just done all this hard work. What if someone else in my organization wants to be able to take advantage of that? It's possible to share these connections not with just JMP Live as I showed, but also with others.

To do that, we have this little red triangle menu here. If we click on that, we see we have an option to share connectors. I'm going to bring this up. You can see a list of our configurations on the left here. The ones we have here are excluding the built-in ones because there's no need to share those.

If we select our configuration, we'll see it show up here on the right. I double-clicked to do that, but we also have these buttons, so I can click this one to move it or rather remove it. I click this to add it again. In order to share we are going to be creating an add-in. We're going to use JMP's add functionality in order to distribute these configurations and make them easy to install.

But because it's an add-in, all add-ins need an ID. That's what this box down here is. For defaulting it to the hostname of your machine with the subcomponents reversed. The address of my machine is this mh382.mmt.com. We've spun it around, com.sas.mmt.mh302. That's going to be the base. Then since I've only added one configuration it's just going to use the name of that. But if I added another one, for example, then it would just be data connectors.

But let's remove that. Let's click Save. It's defaulting the name to the full ID plus JMP add-in. But we don't need to do that. We can make it a little friendlier, so we're going to go ahead and just call it Server. In fact, we can do a little better. We don't have to use this built-in name. Let's just cancel that out. Let's change the name. Let's say for example company, we're going to use my username. Instead of just using this name again we're going to say what it's for us for this year's Discovery Conference.

Hit Save. We're going to rename it something a little friendlier. Save it out, and now we're being asked if we want to keep the password in. Since we expect that your password is not necessarily something you're going to want to share with other people, we find an option to help you remove it automatically. In this case, let's go ahead and remove the password.

You can see it's saved. We see my desktop now has this add-in on it. But now what do we do with it. Well you have to someone else. In this case, instead of giving me someone else, I'm going to give it to myself. But over on Windows. Let's switch to Windows. We have to log in here real quick, and here we are. Here's JMP. We need to find this add-in which I have my downloads' folder here. Now I have this add-in right if I open it up and JMP, and install it.

All right, and now see in action we're going to go File, Database, Data connectors. We're going to bring up our configurations. We see that here it is. Try connecting. We see it's failed to connect because login failed. Well that makes sense. We moved our password. Let's say yes we want to configure it some more. Let's add that password back. You can test it. See that it works. We hit OK. We see we've connected.

We've created [inaudible 00:26:10] Mac. We've brought it over to Windows, and we're able to connect it. We'll see all the same stuff make that all work. That's sharing. That's configurations from scratch. But how does this all work underneath. Let's go back to JMP here. We see this under SQL Server built-in figuration. We see these other configurations that we've created.

Well what do they look like? Where do they live? For the built-in configurations those live in a folder that's part of the JMP installation, but ones that you create, those are going to be in JMP's user data area with all of your other data. If I navigate here to the library folder in my user folder on Mac, go to Application Support and JMP on Windows.

This is a folder under App Data. You can see that along with a bunch of other JMP specific information, we have like version specific stuff. We have Python stuff. We also have a data connectors folder.

If I open that up, I see files that correspond to all the configurations that I've made and the JMP's made from ODBC on my behalf. For example, if we look we see this MySQL Server configuration, and we have a JMP DC extension that stands for JMP Data connector. Aside from all these JMP DC files, we also have this JMPDataConnectorRegistry.JSON file. That contains information on all of these is what maps that name the MySQL Server for instance, to this specific file.

It can also contain some other information. But let's look at what this is inside. We're just going to use a text editor here and now, sorry, that was the wrong one. I've named two very similar ones, so we're going to open up this one. What we see is a lot of text here that looks like JSL, and that's because it is JSL or rather, it's a subset of JSL.

Here we have, on the left we have our option names, and on the right we have their values. This first line here is special. It's telling us that our base is the one that's going to be identified by com.JMP.SQLServer. We can see are there settings in it. We can see that this server we've set is their database, our modification to the pro type.

Generally speaking these option names on the left map very closely to what's in the editor. On the right here we have the values that are again going to resemble what is in the editor. Here we don't have the little text at the end that gives an example. More notably for the username and password we have these encoded strings here. These we've stored like this so that it's you know not directly storing your username and password.

But JMP was able to connect with them all the same, and as I said, this is JSL. One way we see that is if we want to add platform specific configuration, we've seen that for built-in ones. But if we want to do it ourselves, say we want to take, let's pick an option we haven't talked about yet. Block fetching. It's something that makes it more efficient to get data if you can enable it, but sometimes it isn't supported or doesn't work properly.

Say we have it working here on my Mac, but not on Windows. We want to have it on for Windows, for Mac, but otherwise we want it to be off. With a line like this where I've set the Block Fetch option to this JSL expression, that will have the desired effect, because all of these values on the right are just JSL code.

Whenever JMP loads up this configuration, it's going to evaluate this. It's going to say, "Oh yes," in this case we're on a Mac. We're going to have Block fetching turned on. That's how that will work. Yes, and JMP itself you can use JSL to work with these. Let's close this because it's out of the way.

Let's just close all this up. Here we are back and JMP. Let's open up a script editor here. It's boilerplate. What can I do? Well I can load up the configuration from a file. It's going to be in memory data connector. Again let's look at the… We've been working with this MySQL Server. This line here is going to load it up. It's going to give us that configuration. Then I can do things like get specific values from it, so I can get that, the Block Fetch option that we set.

Or I could get the quote type. I can change options, so I could set the quote type back to the value I had before, which is brackets. I could just turn the Block Fetch off. Now we can print it again to see that it's having the effect we want. If I run all of this, we see that with that the Block Fetch is on. Quote type is double. Then after I set it, things are off.

That's some of what you can do to manipulate this. If we look at the whole thing, what happens is we will see all the values that are not their default values. For example, we see our username and password. But we see a bunch of other options where SQL Server is different from the default. See all that there. The registry. That mapping from names to actual configurations is also accessible through JSL.

I have a variable here. Then I can do things like get all the available entries. This is going to list both again the ones that we created and the ones that are built into JMP as well. If we were doing this on Windows, we'd see the one installed through the add-in as well. We can get metadata about a particular entry. We can again look at a configuration that we've created here. That will tell us both the name and the path. That essential information, but it also includes the type, and because it's ODBC, it's going to be having the driver as well.

You can also register a configuration for JSL. Say we wanted to write out our own, and we're going to call it MySQLServer2DC. We've added a new path variable. For those of you who are familiar with those. This points right into that data connectors folder that I was showing earlier. I can write out some text there.

We're just going to make it simple. We're going to start with our base configuration there this MySQL Server. We are going to set let's say some type back to what it was. Now I have this. Finally, with my follow my path, I'll register it. The name is going to be matching the file name, just like we were doing with the others. Then the path. Path. If I run all of this, then we can try loading it up.

Here we're just going to be like we were doing before, but with the two. We can see that work, that print out something here. We could even get the value we changed. See that that's there. All right, so again, we've seen that, and we can both use JSL to manipulate configurations. If we're willing to work with those configurations directly, we can see where they are on the file system, and we can add platform specific configuration.

That's a lot that I've talked about. Let's wrap things up a bit. Let's go back to the beginning. I said that we had specific goals for this initiative. Right that let's go through the one by one. That first one was that it would be possible to connect to more data sources. For JMP 18, we expanded data source access within ODBC. I've hoped that by, for example, changing the JSON type for SQL Server, I've given you a bit of a taste of what's possible there.

We've increased the ability to be compatible with your data sources. In terms of sharing, I hope that showing the sharing to JMP Live to Windows has, I think, made clear what we can do there. Finally, more uniform interaction. This is one we really haven't seen. I've shown you a new interface. I've shown you something that only works with ODBC. This is where it's important to remember that we focus on ODBC, and we were laying the groundwork for other things. This is something that we'll see in the future.

To talk about the future in JMP 18.1, we have some minor enhancements that are planned mainly to the editor. JMP 19, we're going to have more enhancements to the editor. But I think more importantly, we're going to have new types of connections. We're going to take something that's currently only for ODBC, but also expand it to other things. The first of those things I want to mention is SAS.

In JMP 18 we had to remove the SAS connectivity. But in JMP 19 we're bringing it back, and it's going to be integrated into this data connector framework. That means that the same interface for connecting the same editor for editing configurations and the same ability to share those configurations, like with add-ins. We'll be also adding connectivity to Amazon S3 and Microsoft SharePoint. Something that I think people have been requesting for a while now.

The interesting thing about those two, s3 and SharePoint is that they are not going to be distributed with JMP. While SAS, we're aiming for something that's along the lines of what we had prior to JMP 18. It's going to be distributed with JMP. You're going to have access to a lot of the GSL functionality that used to be there.

For S3 and SharePoint, those are not going to be packaged with JMP, but they're going to be distributed via the JMP marketplace. Which some of you may have heard of. If you haven't, you'll probably hear about it elsewhere at discovery.

But in short, it's going to be a way to distribute extensions to JMP. We're going to be distributing those, the S3 and SharePoint connectivity, that way. In order to get those to work with JMP we're going to be defining a Python API exposed from JMP that can be used to interface with everything that's needed to plug into that connector framework and then go to Query Builder and beyond.

In order to implement these, we use that interface, and we also used existing Python libraries. That's part of the wonder of the Python ecosystem, and also the fruits of the work that we've done to improve the Python integration for JMP 18. But that's not something that's going to be restricted to us.

That interface is something we're going to expose, and we're going to document and something that's going to let you know not only us, but also you create new types of connections and have them integrated with JMP integrated with Query Builder. You can use all the same tools.

That's what we're going to do for the future. That's basically all I have. I want to again say that I hope that what we've done in 18you know, not only has laid the groundwork for future improvements like the ones I'm discussing here, but also in itself provides a compelling upgrade to what we can do with ODBC.

That it's possible to connect to more data sources, that you can share those connections easier. Again, I think the future is going to be even better in terms of what we can do. I look forward to what we can do there. With that, I want to say thank you for listening in, and I hope you enjoy the rest of discovery.

Presented At Discovery Summit 2024

Presenters

Skill level

Intermediate

Beginner
Intermediate
Advanced

Files

Slides.pptx

Configuring, Connecting, and Collaborating with Data Connectors in JMP 18

Presenters

Skill level

Files

Data Access