JMP Blog

MikeD_Anderson · Jun 26, 2019 07:07 PM

I wish I had been told how to navigate the Analyze menu. The Analyze menu is an enigma wrapped in a riddle, wrapped in duct tape ... or at least I thought so on my first day. Let's deal with that one next.

If you're just tuning into the series, here's a brief recap:

- I'm listing five* things I wish someone had told me on Day 1 using JMP.
- Thing 1: Always start with a graph.
- Thing 2: The four-step JMP workflow everyone should know.

And now on to Thing 3...

How to navigate the Analyze menu

The Analyze menu can seem confusing. When I started using JMP, I was continually asking myself, "Why is that there??" (If I were 100% truthful about it, my question was usually a lot more colorful than that, but you get the idea.) It wasn't until I began teaching others to use JMP that it finally clicked – there is a method to this apparent madness!! All you need is one little mental trick, and you can understand where everything is and why it is where it is (most of the time).

Let's think about how people generally analyze data ... well, let's think about how they should analyze data. When people get a new data set, the first thing they should do is create a graph of some sort. If they don't want to make any assumptions about relationships, they would use histograms. If they looked at the first item in the Analyze menu, they'd find something called Distribution. Can you guess what chart they would get for free in Distribution? Right! It's a histogram! This would help them understand the shape of the data, look for quality issues, etc. (more on that in the next bit).

Next, after studying the distributions of their data, they should start to consider how things might relate. This would usually start by comparing one response and one factor – bivariate relationships. They may want to look at ANOVAs, logistic relationships, lines of fit, etc. That's all in Fit Y by X. In short, if the question you are trying to answer involves one column of data, then choose the first position on the Analyze menu (Distribution). If the question you are trying to answer involves two columns of data that are related in some way, choose the second position on the Analyze menu (Fit Y by X). Dozens of appropriate statistics almost magically appear if you match the first two items on the Analyze menu to the number of columns referenced in your question.

At first, the pattern seems to be that the menu is ordered by how many variables are being compared. After Fit Y by X, that pattern appears to break. The trick to understanding the Analyze menu is that it's not ordered by the number of variables being considered, but by the complexity of the problems the menu items address. That's why Consumer Research is at the bottom – people, as a whole, are nuts, and data involving people is, invariably, incredibly complex.

So, that would seem to explain the top and bottom part of the menu, but what about the middle? Tabulate (and Text Explorer) appear to be in an odd spot. However, if you consider that they are primarily transformative tools, their position makes more sense. Tabulate is a great way to summarize data for further analysis. Text Explorer's job is to take free text and convert it into something that the other analytical methods can handle. So, they are bridging tools used to go from one level of analytical complexity to another.

Here's a little graphic to help you remember:

[Image: Analyze Menu Annotation.png]

Now you try

OK, let's work through a couple of thought examples.

If you needed to compare one variable to a target, where would you look in the Analyze menu? If you said Distribution, you would be correct. You're looking at a simple comparison (top part of the list), and it involves one variable (column of data) – which is Distribution. The one-sample t-test is an option under the Red Triangle (Answer Button) in Distribution.
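For readers who want to see what "comparing one variable to a target" boils down to numerically, here is a minimal sketch of the one-sample t statistic in plain Python. It is not JMP code and not how JMP implements the test; the function name `one_sample_t`, the measurements, and the target value are all made up for illustration.

```python
import math
import statistics


def one_sample_t(values, target):
    """Return (t statistic, degrees of freedom) for H0: mean == target."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("values have zero variance; t is undefined")
    standard_error = sd / math.sqrt(n)
    return (mean - target) / standard_error, n - 1


# Hypothetical measurements and target, invented for this example only.
measurements = [9.8, 10.2, 10.1, 9.9, 10.4, 10.0]
t_stat, df = one_sample_t(measurements, target=10.0)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

The point is only that the question involves a single column plus a fixed number, which is why it lives under Distribution rather than further down the menu.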

Now, what if you wanted to get correlation coefficients for all the continuous variables in a data table? If you said Multivariate (under Multivariate Methods), you'd be correct. The primary comparison is bivariate, so you could do it pairwise in Fit Y by X. But, since you're doing it over a lot of columns, that is really a more complex problem. Therefore, we'd look lower in the Analyze menu. Multivariate sounds kind of like "multiple variables" – maybe you think this is more of a multivariate situation. Looking in Multivariate Methods, things start looking pretty complicated really quickly, so a reasonable first guess would be the Multivariate platform. Running that platform will give you several ways to get correlation coefficients for all the combinations of the variables you provided.
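To make the "bivariate comparison, repeated over many columns" idea concrete, here is a rough sketch in plain Python that computes a Pearson correlation for every pair of columns. It is only an illustration of the arithmetic, not what the Multivariate platform does internally; the helper names `pearson` and `pairwise_correlations` and the small table are assumptions invented for this example.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant column has no defined correlation")
    return sxy / math.sqrt(sxx * syy)


def pairwise_correlations(columns):
    """Map each unordered pair of column names to its correlation."""
    return {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }


# Hypothetical data table, invented for this example only.
table = {
    "height": [59, 61, 55, 66, 52, 60],
    "weight": [95, 123, 74, 145, 64, 84],
    "age": [12, 13, 12, 14, 12, 13],
}
for (a, b), r in pairwise_correlations(table).items():
    print(f"{a} vs {b}: r = {r:.3f}")
```

With three columns there are three pairs; with twenty there are 190, which is why doing this one pair at a time in Fit Y by X stops being practical.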

Again, the point here is that once you understand the layout, you have a decent idea of where things should be, or can at least move from WAG to SWAG territory.

Tune in next time as we delve into the amazing properties of an often-overlooked part of JMP.