There are many ways of generating a model, such as basic linear regression, decision trees, neural nets and generalized linear models. JMP Pro can be a great tool for data miners: those who want to get more information out of their data and build more accurate predictive models. It incorporates traditionally complicated algorithms that, in true JMP fashion, even a novice can harness to quickly build powerful models. Bootstrap forests, boosted trees and multilayered-boosted neural nets are just a sample of the powerful tools in JMP Pro. One thing that makes JMP Pro a powerful data mining tool is the ability to build, tune and test your model in one step.
A while back, this blog featured a great post on the concept of training, validation and test sets to build your models. To build a model that is not only descriptive but also predictive, validating your model and subsequently testing it is essential.
As that blog entry describes, the first step is to split your data into three portions: training, validation and test. Next, you build a model on the training portion that captures the behavior of the data (your system) and not just the noise. Once you have a candidate model, you apply it to the validation set. If the model captures the underlying behavior, it should perform similarly on the validation set. If it does not, you have to go back and build another model; this process may go back and forth a few times. Once a model built on the training set performs similarly on the validation set, you expect it to be repeatable, so you apply it to the test set.
JMP Pro greatly simplifies this process. The first enhancement is a much easier way of generating a validation column. From JMP, choose Cols -> New Column, then Initialize Data -> Random -> Random Indicator. The Random Indicator column defaults to the values 0 (Training), 1 (Validation) and 2 (Test). You can then choose what portion of the data you want to use for each step.
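For readers who want to see what such a column amounts to outside JMP, here is a minimal Python sketch. It is not JMP's implementation: the function name `random_indicator`, the 60/20/20 proportions and the seed are all assumptions chosen for illustration. It only shows the idea of assigning each row a 0, 1 or 2 at random.

```python
import random

def random_indicator(n_rows, fractions=(0.6, 0.2, 0.2), seed=1):
    """Return one code per row: 0 = Training, 1 = Validation, 2 = Test.

    The fractions are hypothetical; each row is drawn independently,
    so the realized proportions are only approximately as requested.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    return rng.choices([0, 1, 2], weights=fractions, k=n_rows)

codes = random_indicator(1000)
labels = {0: "Training", 1: "Validation", 2: "Test"}
for code, name in labels.items():
    print(name, codes.count(code))
```

Fixing the seed means the same rows land in the same portion every time, which makes it easier to compare models built later against the same split.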
Most of the modeling platforms in JMP Pro now have an option to input a validation column. This lets you complete the long process of building, tuning and testing your models, described above, in one easy step. As you're inputting your variables, choose your newly made column as the validation option.
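To make the build, tune and test roles of the three portions concrete, here is a rough Python sketch. It does not reproduce what any JMP Pro platform does internally. The data are synthetic, the model is a simple nearest-neighbor average standing in for a real platform, and every name in it (`knn_predict`, `rmse`, the candidate values of `k`) is an assumption made for illustration.

```python
import random

def knn_predict(train, x, k):
    """Average the y values of the k training points nearest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def rmse(train, rows, k):
    """Root mean squared error of the k-nearest-neighbor fit on rows."""
    errs = [(y - knn_predict(train, x, k)) ** 2 for x, y in rows]
    return (sum(errs) / len(errs)) ** 0.5

rng = random.Random(0)
# Synthetic data (illustrative only): a linear signal plus noise.
data = []
for _ in range(300):
    x = rng.uniform(0, 10)
    data.append((x, 3.0 * x + rng.gauss(0, 2.0)))

# Validation column: 0 = training, 1 = validation, 2 = test.
column = rng.choices([0, 1, 2], weights=[0.6, 0.2, 0.2], k=len(data))
train = [row for row, c in zip(data, column) if c == 0]
valid = [row for row, c in zip(data, column) if c == 1]
test = [row for row, c in zip(data, column) if c == 2]

# Tune: keep the k with the lowest error on the validation rows.
candidates = [1, 3, 5, 9, 15]
best_k = min(candidates, key=lambda k: rmse(train, valid, k))

# Test: report the error once, on rows used for neither fitting nor tuning.
print("best k:", best_k, "test RMSE:", round(rmse(train, test, best_k), 3))
```

The point of the sketch is the division of labor: the training rows build the model, the validation rows choose the tuning setting, and the test rows are touched only once at the end.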
One fact of statistics is that not all modeling techniques work well for all data sets. Each technique has its strengths and weaknesses; each one can teach you something different about your data. One strength of JMP Pro is that it offers multiple modeling techniques under one roof. You can try each algorithm, evaluate its performance and even save the model (prediction formula) to the data table as a new column. JMP Pro Version 10 now allows you to compare your models once they've been built.
Run the Model Comparison platform and choose the models you want to compare (or let JMP choose them).
In this example, the Bootstrap Forest seems to have generated the best model. If multiple models have similar statistics, another method to view the models is to use the interactive Profiler platform. It not only allows you to develop an intuitive understanding of how the model works, but it also shows how varying multiple inputs can affect the output. You can download the data I used in this blog post from the JMP File Exchange and try it out yourself! (Note: Download requires a free SAS profile.)
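The idea of comparing saved prediction columns side by side can also be sketched in a few lines of Python. This is not the Model Comparison platform: the responses and both prediction columns below are made-up numbers, the model names are placeholders, and R-square and root mean squared error are assumed here as two reasonable fit statistics to rank on.

```python
def rmse(actual, predicted):
    """Root mean squared error between actual and predicted values."""
    n = len(actual)
    return (sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n) ** 0.5

def r_square(actual, predicted):
    """1 minus residual sum of squares over total sum of squares."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Made-up held-out responses and made-up saved prediction columns.
actual = [3.0, 5.0, 7.0, 9.0, 11.0]
predictions = {
    "Model A": [2.5, 5.5, 6.5, 9.5, 10.5],
    "Model B": [4.0, 4.0, 8.0, 8.0, 12.0],
}

# List the models from lowest to highest error on the held-out rows.
ranked = sorted(predictions.items(), key=lambda kv: rmse(actual, kv[1]))
for name, pred in ranked:
    print(f"{name}: RSquare={r_square(actual, pred):.3f} RMSE={rmse(actual, pred):.3f}")
```

Computing the statistics on rows that were held out of model building is what keeps the comparison honest; a model can look best on its own training rows simply because it has memorized the noise.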
One of the key parts of any analysis is to be able to communicate your findings in an organized manner. JMP has a variety of ways to save your analyses and share them with your colleagues. I've recently stepped through a number of ways to work with your analyses in a Webcast. True to JMP’s philosophy, every analysis has two parts: the visualizations and the statistics. Depending on the method you choose, you may have more or less freedom to edit either part or both parts of the analysis. In the table below, I’ve summarized the different options within JMP and the benefits of each one. One thing to keep in mind is that the longer you keep an analysis within JMP, the longer you’ll have JMP functionalities -- for example, live data linking, the opening and closing of Outline items (blue triangles), and plotting points as different colors or marker types. Once you’ve taken an image or text out of JMP, you are limited to the capabilities of whatever program you bring it into to make any changes or edits. I’ve labeled a few of the options as JMP Systems Engineer favorites. We are often interfacing with new and experienced users and have a few methodologies that seem to be most useful to all users. 
Each entry below gives the file extension or path to saving the report, followed by its + and - points.

Saving Analyses

Hot Spot -> Script -> Save script to data table (.jmp) *
  + No additional files; easy to organize; JMP editing capabilities
  - Need access to data table

.jrp
  + Retains data interactivity; no need to change data table; JMP editing capabilities
  - New file to keep track of; location of files is hard coded (default) †

Edit -> Journal, Edit -> Layout
  + String multiple historic reports into a journal/layout; retain some JMP functionality
  - No live data link

.html, .rtf
  + Choice of image file type; editable text
  - No live data link

.txt
  + Editable text
  - No live data link; no visualizations

.jpg, .gif
  + Snapshot
  - Not easily editable

Paste Special -> enhanced metafile * ‡
  + Editable text; vector images; can update images
  - 3rd party program limitations when editing

Paste Special -> bitmap * ‡
  + Snapshot; part of your Word document or PowerPoint presentation
  - Not easily editable

Journal

.jrn
  + Multiple ordered analyses; embedded links to any file types
  - Location of files is hard coded (default) †

.jpg, .gif
  + Multiple ordered analyses; snapshot
  - No links

Project

.jmpprj *
  + Multiple journals; any file types; totally archivable JMP and 3rd party files; provides organizing structure that retains data interactivity
  - Windows only; Mac accessible

* JMP Systems Engineer Favorite!

† The location of the file can be edited and made relative. For a .jrp, right-click on the file and choose Open With -> Notepad (or any text editor). In a journal, right-click on the file link and choose Set Script. Erase the path to the filename and leave just the file name. For example, if you have the file “C:\Data\Big Class.jmp”, leave only “Big Class.jmp”. Now the .jrp or journal will look for the Big Class.jmp file in the directory where the .jrp or journal is located.

‡ First, use the Selection tool, right-click and choose Copy, then switch to a program such as MS Word or PowerPoint. From there you can choose Paste Special.
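The path edit described in the † note amounts to keeping only the last component of a Windows path. As a small illustration in Python (the helper name `file_name_only` is hypothetical, and this does not edit a .jrp or journal for you; it only shows which part of the path to keep):

```python
from pathlib import PureWindowsPath

def file_name_only(hard_coded_path):
    """Drop the drive and directories of a Windows path, keeping the file name."""
    return PureWindowsPath(hard_coded_path).name

# The hard-coded location from the note becomes a bare, relative file name.
print(file_name_only(r"C:\Data\Big Class.jmp"))  # Big Class.jmp
```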