In this installment of Help!, Ross Metusalem shows you how to use sample data and prebuilt graphs to teach yourself a variety of analysis techniques. Fully supported by the JMP documentation, the Sample Data Library lets you learn more about JMP functionality using sample scripts, teaching scripts, sample dashboards, and much more.
Full Transcript (Automatically Generated)
Everybody, welcome to this edition of Help!, a review of JMP's help resources and how to use them effectively. I'm Ross Metusalem, a systems engineer, and today we're going to talk about the Sample Data Library, which is a repository of example data sets and prebuilt analyses that you can use to get some hands-on practice with JMP. It's great for better understanding JMP's capabilities and for honing your analytic skills.
So let's take a look at an example around the Partition platform. For those of you who are unfamiliar with the Partition platform, it is a tool for building decision tree models, which are great for data mining and predictive modeling. If you're new to the Partition platform, you might start here at the documentation to read up about it. After that, you may head on over to something like Mastering JMP to see an instructional video, but eventually you're going to want to get some hands-on practice. After all, it's a big leap from reading an instructional text or watching a video to actually trying something out yourself. So we want to get some hands-on practice with partition models, but we need some data to get that practice. Let's check out the Sample Data Library to see what it has for us.
I'm going to go to Help > Sample Data. The window that's popped up here is an index with links to all of the sample data sets that are built into JMP, so you get these automatically when you install JMP. On the right, you can see we have data organized by domain. And on the left, where we're going to play, we see data organized by type of analysis. For example, if I wanted to work on analysis of variance, I could check out this section, where we see various data sets for different types of ANOVAs.
I'm interested in partition models. I happen to know that those live here under Exploratory Modeling, but if I don't, that's okay. I can always go to Search > Find, or just press Ctrl+F, and type "partition". Then if I click Find, you can see it's found the first one, the Boston Housing data set. If we cycle down, we have Car Poll, Cereal, the Diamonds data that Julian started out with today, and a couple of others as well. So there are numerous data sets we can use to get some practice with partition models. Let's open up the Titanic Passengers data set to see what's in there. Here we have a standard JMP data table. When you open one of these sample data tables, you may notice a number of items in the table panel in the top left here. This could include references or notes. If you find anything like that there, I highly recommend checking it out, because it will provide additional context. So if I hover over Notes here, it'll pull up some text
or should pull up some text. Oh, I didn't have that selected. Well, sometimes when I'm impatient, I'll just click Edit, and here we can see that this data table describes the survival status of passengers on the Titanic. Very good. So I want to practice partition models. I'll go to Analyze > Predictive Modeling > Partition, because I've read in the documentation that that's where the Partition platform lives. Now, what do I do? Well, if I'm brand new to partition models, I might look at this launch dialog and say, you know, I'm not exactly sure where to put all these variables to make sure I get a valid analysis. Luckily, this data table comes with a few prebuilt analyses in the form of table scripts. This first one here is for the Partition platform, so I'll click the green button to run the analysis. Now we have bypassed that launch window, and we know that we have a valid partition analysis. If I want to see how that launch window was actually filled out, I can go to the red triangle and, under Redo, choose Relaunch Analysis. I can see that, okay, we took Survived, that's our response variable, and put it into the Y, Response role. And then we took several factors and put them in X, Factor.
So this is a nice way to learn. Actually, as I've learned JMP throughout the years, I've often found myself launching prebuilt analyses in the Sample Data Library and then doing this Redo > Relaunch Analysis trick to see how they were set up in the first place. So now I have this analysis, and if I wanted, I could play around with it and explore it a little more. I could request a couple of splits in my tree, maybe to find that, oh, it looks like females had a much higher survival rate than males, and in particular, those females were in passenger classes one and two. So this is pretty nice. It's a great way to poke around the Partition platform and learn more about it.
But, you know, there are a lot of options. Maybe I don't know what the Split and Prune buttons do, necessarily, or what some of these options under the red triangle do. It could be the case that you don't feel entirely comfortable just diving right into a data set, and you'd like some more hands-on guidance. One thing that's really nice about the data that you find in the Sample Data Library is that it's referenced in the support documentation, oftentimes with step-by-step examples. So let me pull up the Partition documentation. If we look over to the side here, you can see we have an example. When I click that, I have a step-by-step example of launching and using the Partition platform to build a decision tree model and interpret the results. You can see the first step here says to go to Help > Sample Data Library and pull up the Diabetes data set. And so this brings me to something I want to point out here.
When I first went to Help, I went down to Sample Data. But if you were watching, you may have noticed something right near it that said Sample Data Library. So what's the difference between Sample Data and Sample Data Library? Well, the library really just refers to the folder on my computer where all these sample data sets live. So if I go to Sample Data Library, I'm in this case in my Mac Finder window, and I can just scroll on down and find that Diabetes data set to work with for the example in the documentation. So if you want some more hands-on guidance before diving into a data set on your own, go ahead and head to the documentation, look for something that says "Example of" the Partition platform or anything else, and follow the step-by-step instructions first.
That's the basics of the Sample Data Library. Before we go, I do just want to point out that it actually contains more than just data tables. If you look at the buttons up here, you're going to see buttons to open various other directories, including a directory of sample scripts and one of sample dashboards. These, again, are just prebuilt materials that you can use to poke around and hone your skills. So that's what the Sample Data Library is all about. It provides you with the materials you can use to get some hands-on practice with JMP, to better understand JMP's capabilities, and to hone your analysis skills. So that's it for Help! today. Julian, I'll kick it back to you.
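Editor's note: the walkthrough above requests a couple of splits in the Partition report and reads off that sex, then passenger class, separate survivors from non-survivors. For readers curious about what "finding the best split" means, here is a minimal Python sketch of that idea. It is only an illustration under stated assumptions: the rows are a small made-up table (not the Titanic Passengers data), the names `ROWS`, `COLUMNS`, `gini`, and `best_split` are hypothetical, and the score used is Gini impurity, a common textbook criterion that is not necessarily the one JMP's Partition platform uses.

```python
from collections import Counter

# Hypothetical toy rows: (sex, passenger_class, survived).
# These are made up for illustration and are NOT real Titanic records.
ROWS = [
    ("female", 1, 1),
    ("female", 2, 1),
    ("female", 3, 0),
    ("female", 1, 1),
    ("male", 1, 0),
    ("male", 2, 0),
    ("male", 3, 0),
    ("male", 3, 1),
]

# Candidate factor columns (name -> tuple index) and the response column index.
COLUMNS = {"sex": 0, "passenger_class": 1}
TARGET = 2


def gini(rows):
    """Return the Gini impurity of the response in `rows` (0.0 means pure)."""
    if not rows:
        return 0.0
    counts = Counter(row[TARGET] for row in rows)
    total = len(rows)
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def best_split(rows):
    """Try every 'column == value' split and return the one with the lowest
    size-weighted Gini impurity as a (column_name, value, score) tuple."""
    best = None
    for name, idx in COLUMNS.items():
        for value in sorted({row[idx] for row in rows}):
            left = [r for r in rows if r[idx] == value]
            right = [r for r in rows if r[idx] != value]
            if not left or not right:
                continue  # a split must send rows to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            # Strict '<' keeps the first candidate when scores tie.
            if best is None or score < best[2]:
                best = (name, value, score)
    return best


if __name__ == "__main__":
    name, value, score = best_split(ROWS)
    print(f"Best first split: {name} == {value!r} (weighted Gini {score:.3f})")
```

On this toy table the first split chosen is on sex, which mirrors the kind of result described in the walkthrough. Applying `best_split` again to each resulting group would grow the tree one level at a time, which is roughly what repeated clicks on the Split button do conceptually.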