Try This Easy Way to Learn the JMP Partition Platform
Apr 17, 2008 10:00 AM
Marie Gaudard, Phil Ramsey and Mia Stephens have taught JMP and used it in their North Haven Group Six Sigma consulting practice since the release of JMP Version 4 in the 1990s. All three are strong believers in the value of the JMP Partition platform for novice to expert users.
Marie is an ardent data miner. She recently added a new file to our File Exchange. It’s a JMP data table that will help you learn how to use partitioning to mine data.
The problem is easy to understand: Some print jobs are ruined by a band of ink on the pages, and the production team wants to identify the factors that may be causing the ‘banding problem’.
The data is easy to use: Marie provides data on more than 500 print runs and includes embedded scripts to get you started.
There is a tutorial to help you along: North Haven Group wrote a white paper that gives background on data mining and partitioning, written as a tutorial for implementing the techniques in JMP.
I talked with Marie last week about why she likes JMP’s Partition platform.
Marie: It is intuitive, powerful, easy to understand and users love it! It’s powerful because it handles large amounts of data and it’s trustworthy because it examines a very large number of possible splits and picks the optimum one. It is especially useful when the explanatory variables are nominal and have many levels.
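For readers who like to see an idea in code, here is a minimal Python sketch of what "examine many possible splits and pick the best one" can mean for a nominal variable. It is not JMP's implementation: it uses Gini impurity as a stand-in splitting criterion, and the column names (press, banding) and the eight rows of data are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_nominal_split(rows, column, target):
    """Try every two-group partition of the levels of a nominal column.

    Returns (impurity_reduction, left_levels) for the best partition found.
    """
    levels = sorted({r[column] for r in rows})
    parent = gini([r[target] for r in rows])
    best = (0.0, None)
    first, rest = levels[0], levels[1:]
    # Keep the first level on the left so each partition is visited only once.
    # k stops short of len(rest) so the right-hand group is never empty.
    for k in range(len(rest)):
        for extra in combinations(rest, k):
            left_levels = {first, *extra}
            left = [r[target] for r in rows if r[column] in left_levels]
            right = [r[target] for r in rows if r[column] not in left_levels]
            weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            gain = parent - weighted
            if gain > best[0]:
                best = (gain, left_levels)
    return best


# Made-up data: presses A and C always show banding, B and D never do.
rows = [
    {"press": p, "banding": b}
    for p, b in [("A", "yes"), ("A", "yes"), ("B", "no"), ("B", "no"),
                 ("C", "yes"), ("C", "yes"), ("D", "no"), ("D", "no")]
]

gain, left_levels = best_nominal_split(rows, "press", "banding")
print(gain, sorted(left_levels))
```

A nominal variable with L levels has 2^(L-1) - 1 such two-group partitions, which is why many-level variables are tedious to explore by hand and a natural job for an exhaustive search.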
Me: Is it for JMP power users only?
Marie: Emphatically, no! Wearing my trainer’s hat, I love it because it is so easy for my clients to use, and it gives them incredible insight into their data. We just teach a new user the basics of opening, saving, and navigating the interface and about the convention of the ‘red triangle’. The red triangle reveals lots of options available to them after they do their first split – options like small tree views and a leaf report. They just click SPLIT to see, in a graphical tree view, the variables that are most likely to affect the outcome in which they are interested, and the nodes that describe how the variables are related to the outcome. Then, they can easily click PRUNE when they want to reverse the operation. Your readers will know what I mean as soon as they open the data and run our scripts.
Me: How does it compare to other data mining tools you’ve used?
Marie: Many other data mining tools that do this kind of analysis are largely inflexible. For example, they give you the final tree based on their built-in stopping rules. But you know more about your data, and about the organizational constraints you must consider, than the software does. JMP lets you lock out a variable that may be interesting but is not useful for understanding the problem. Then you can very easily go back and split on other variables that may be more valuable. You can also split at specific nodes. We find this very valuable for gaining a deeper understanding of the data. And you can decide when to stop splitting, based on your knowledge of the process or on criteria provided by the Partition platform itself.
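The "lock out a variable" idea can also be sketched in a few lines of Python. Again, this is an illustration under assumptions rather than how JMP works internally: the criterion is Gini impurity, each candidate split is simply one level versus all the others, and the columns (operator, paper, banding) and data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, target, locked=frozenset()):
    """Return (gain, column, level) for the best one-level-vs-rest split.

    Columns named in `locked` are skipped, mimicking a locked-out variable.
    """
    parent = gini([r[target] for r in rows])
    best = (0.0, None, None)
    columns = [c for c in rows[0] if c != target and c not in locked]
    for col in columns:
        for level in sorted({r[col] for r in rows}):
            left = [r[target] for r in rows if r[col] == level]
            right = [r[target] for r in rows if r[col] != level]
            if not right:  # a single-level column cannot be split
                continue
            weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            gain = parent - weighted
            if gain > best[0]:
                best = (gain, col, level)
    return best


# Made-up data: operator separates the outcome perfectly, paper only partly.
rows = [
    {"operator": o, "paper": p, "banding": b}
    for o, p, b in [
        ("Ann", "coated", "yes"), ("Ann", "coated", "yes"),
        ("Ann", "coated", "yes"), ("Ann", "uncoated", "yes"),
        ("Bob", "coated", "no"), ("Bob", "uncoated", "no"),
        ("Bob", "uncoated", "no"), ("Bob", "uncoated", "no"),
    ]
]

print(best_split(rows, "banding"))                       # operator wins
print(best_split(rows, "banding", locked={"operator"}))  # falls back to paper
```

With nothing locked, the search settles on operator; with operator locked, it moves on to paper. The same returned gain could serve as a simple stopping rule of your own, for example by refusing to split when it falls below a threshold you choose.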