The 1st thing I wish someone had told me when I started using JMP: We are visual creatures
Jun 19, 2019 10:31 AM
I asked myself: What do I wish someone had told me when I first got JMP?
Let me take you back a few years to when I started my journey with JMP, and I don't mean JMP the company, but JMP the software. (And, before you start questioning my need to wax nostalgic on the JMP Blog, I do have a point! I promise!)
I was a newly minted PhD starting my first real-world job. I knew from some lab-mates in grad school who had done industrial internships that JMP was the lingua franca of the semiconductor industry, so it wasn't surprising when my manager suggested that I learn the software.
I started when the company was relatively young and hadn't fleshed out its training organization, so there wasn't much in the way of JMP training. As a result of this, I ended up learning JMP on my own – by trial, error, and pestering JMP Technical Support (seriously, you should see my ticket track from back then... I pulled it up recently, it's epic).
OK, so why this fit of nostalgia? Well, a few weeks back, I had a couple of discussions that made me think about what I wish someone had told me when I first started up the software. You know, assume that there's no New User Welcome Kit or JMP User Community. Assume that I don't know who my JMP Systems Engineer is, how to contact Tech Support or anything about the software. Assume that I've figured out how to open JMP and load up a data set, and that's about it. The younger me is staring at a spreadsheet-like window, and something called JMP Home. What would I sit down and tell him? Thinking back and reflecting on the people I've since trained in JMP, I came up with five things. So, in case my younger self is out there somewhere, here are the five things I wish someone had told me the day I first got JMP.
We Are Visual Creatures
This first thing is more conceptual, but it requires a significant shift in mindset for some – particularly coders. We are visual creatures – think about the amount of real estate on our faces devoted to visual data collection (our eyes). By accident or design, we are geared to process visual information extremely quickly – almost to a fault. We need to visualize data to understand it.
The visualization step is critical to doing analytics because people are truly awful at reading tables of numbers. I'm sorry, but there it is. If the table has more than a few lines, you lose almost any hope of finding trends or data quality issues. I was recently reading a book on data science for Python – guess where they started? With tables: four chapters on tables, modeling, and summary statistics before they got to how to graph a data set, which was covered in one (very short) chapter.
The issue is not unique to Python. I ran into the same thing in the other analytics languages I've studied (even LaTeX!). "Tables first" has been drilled into analytics people from Day 1 for years. And, it is wrong. I'm told it's because graphics are hard to code, but it's not just an issue of convenience – it's a blunder. Let me illustrate this point.
The table below shows statistics for a linear model (y = mx + b) for a collection of data sets. The data provide nearly identical (and equally bad) linear models, identical fit statistics, and identical descriptive statistics. If you were to look at the underlying raw data tables, you wouldn't be able to see anything out of the ordinary – yet these data sets are anything but ordinary.
A Graph Builder plot for the underlying data from the table is below. It's interactive, so feel free to play with the filters on the left. Did you find the Easter egg (or rather dino egg)? The data collection is a variation on Anscombe's Quartet, which was designed to demonstrate the importance of visualizing data before modeling or analyzing it. (The reports shown here can be viewed over in JMP Public and the original data came from here.)
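If you'd like to check the claim outside JMP, here is a minimal Python sketch. It uses Anscombe's original four data sets, not the dinosaur-flavored collection in the report above (those values aren't reproduced in this post). The helper names fit_line and ascii_scatter are hypothetical, chosen for illustration, and the text plot is only a crude stand-in for a real graph. Treat it as a sketch under those assumptions, not as how the reports above were built.

```python
from statistics import mean

# Anscombe's original quartet (1973). This is NOT the data behind the
# JMP report in the post; it is the classic set the post's data is based on.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def fit_line(xs, ys):
    """Return least-squares slope m, intercept b, and correlation r for y = m*x + b."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    m = sxy / sxx
    return m, my - m * mx, sxy / (sxx * syy) ** 0.5


def ascii_scatter(xs, ys, width=40, height=12):
    """Draw a rough text scatter plot (hypothetical helper, illustration only)."""
    x_lo, x_hi, y_lo, y_hi = min(xs), max(xs), min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - x_lo) / (x_hi - x_lo) * (width - 1))
        row = round((y - y_lo) / (y_hi - y_lo) * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 is the top of the plot
    return "\n".join("".join(line).rstrip() for line in grid)


for name, (xs, ys) in QUARTET.items():
    m, b, r = fit_line(xs, ys)
    print(f"Set {name}: mean x = {mean(xs):.2f}, mean y = {mean(ys):.2f}, "
          f"m = {m:.2f}, b = {b:.2f}, r = {r:.2f}")
    print(ascii_scatter(xs, ys))
    print()
```

All four summary lines come out essentially the same (a slope near 0.50, an intercept near 3.00, and a correlation near 0.82), while the four text plots look nothing alike: a noisy line, a curve, a tight line with one outlier, and a vertical stack with one stray point. The numbers alone would never have told you that.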
JMP is designed to help you avoid this kind of "analysis before visualization" blunder. Graph Builder makes it easy to visualize your data before you start doing what most people think of as analysis. Most of the reports in the software lead off with graphics. JMP is a tool that facilitates statistical analysis with graphics and visual cues to help guide users to potentially valuable information.
The software takes this idea a step further by making the graphics interactive – allowing the user to explore complex relationships through multiple charts linked to the data table. In short, JMP is designed to make you use a workflow that starts with building a graph first and asking questions later, which is exactly what you should do. Coincidentally, this leads me to the second thing I wish someone had told me – which is where we are going to pick up next time.
... What?! Seriously, sorry for the cliffhanger. I initially planned for this to be a single blog post. After all, these five things I wish someone had told me were originally shared in a one-hour seminar. But, once I finished writing it all down, I realized it had turned into a bit of a monster to read in one sitting. The JMP Blog editors were gracious enough to allow me to break it up into a series. So, we're breaking it all up into bite-sized chunks. You're welcome.