JMP Blog

XanGregg · Jul 26, 2019 02:08 PM

At JSM 2019, I’m participating in a session on Dynamic Interactive Data Visualization. When I was first invited by the session chair, Blanton Godfrey, to contribute a talk, my first thought was, “Sure, JMP has a lot of dynamic and interactive data visualizations.” But a second thought quickly followed, “What does that really mean? Everything is interactive these days.”

That led me down the path of trying to categorize different methods of interactivity. The lofty goal is to help define a classification of data visualization interactivity that would be useful in comparing products or discussing individual implementations. In this initial effort, I’ve classified nine techniques, along with example videos of the interactions within JMP.

Graph building
Selection linking
Row filtering
Hover detailing
Model tuning
Aesthetic mapping
Column switching
Volume slicing
Axis scaling

I’m using the same data set throughout my example videos. It contains 60+ columns of demographic, economic and election data for 3,000+ US counties. Now on to the methods!

Graph building

Interactive graph building means constructing or restructuring a graph with immediate view updates. In JMP, the interactions happen directly on the graph surface. Other implementations can still be interactive with drop zones off to the side of the graph.

The video highlights a couple of challenges. When the second Y variable, “Rep Pct,” was added, there were more Y drop zones than before, so we could specify where to put it with respect to the existing Y variable. And the Color and Size drop zones are artificial in that they don’t have the same natural position as the X and Y axis variables do.

This ending chart is much like one featured in a New York Times article titled, “The Most Conservative Counties Are the Ones That Get the Most Government Assistance.” There was some discussion on Twitter about the regression line and other features. That led me to collect and explore the data myself (Bureau of Economic Analysis and Tony McGovern’s election results). When I realized how many interactivity methods I was exercising during the course of my exploration, I thought this would make a suitable demonstration data set.

Selection linking

When there are multiple views using the same data table, selection linking means that elements that are highlighted in one view have their corresponding elements highlighted in the other views. In JMP, selection linking is on all the time, which means it has to be fast even with millions of rows. To achieve that, selection is baked in as a core part of every data table.

The biggest challenge is when graph elements don’t have a one-to-one correspondence with each other, such as the county-level dots and the state-level shapes in the example. Showing partial selection in a clear way for any graph type is an open challenge. For the map, JMP highlights an entire state shape when any of its counties are selected elsewhere. For graph elements that are sized by count, such as histogram bars, JMP will use partial highlighting to reflect partial selection.
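
As a rough illustration of the two highlighting rules just described, here is a minimal Python sketch. It is not how JMP implements selection; the toy rows and the names `rows`, `selected`, `state_highlighted` and `bar_selected_fraction` are all hypothetical.

```python
# Toy data table: each row is one county. All values are invented for illustration.
rows = [
    {"county": "A", "state": "NC", "income_bin": "low"},
    {"county": "B", "state": "NC", "income_bin": "high"},
    {"county": "C", "state": "VA", "income_bin": "low"},
    {"county": "D", "state": "VA", "income_bin": "low"},
]

# Selection is stored with the table (a set of row indices),
# so every view reads the same state.
selected = {0, 3}

def state_highlighted(state):
    """Map rule: a state shape is highlighted if ANY of its counties is selected."""
    return any(rows[i]["state"] == state for i in selected)

def bar_selected_fraction(bin_name):
    """Histogram rule: fraction of a bar's rows that are selected."""
    members = [i for i, r in enumerate(rows) if r["income_bin"] == bin_name]
    if not members:
        return 0.0
    return sum(1 for i in members if i in selected) / len(members)
```

With this toy selection, both state shapes are fully highlighted even though only one county in each is selected, while the "low" bar is highlighted in proportion to its selected rows.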

Row filtering

Row filtering is one of the most common types of data visualization interaction. The idea is that the visualization only uses a subset of the data table’s rows, based on the constraints of the filters which appear next to the visual.

There is an option called "Lock Scales" that I used in the video to keep the axes fixed while filtering. Otherwise, they would be adjusted for each filtered subset.
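
The difference between locked and unlocked scales can be sketched in a few lines of Python. This is only an illustration under an assumption: locked limits are taken here from the full table, which may not match exactly how JMP's option decides them. The rows and the function names are hypothetical.

```python
# Hypothetical county rows; values are invented for illustration.
rows = [
    {"state": "NC", "income": 30.0},
    {"state": "NC", "income": 55.0},
    {"state": "VA", "income": 42.0},
    {"state": "VA", "income": 48.0},
]

def visible_rows(rows, predicate):
    """Row filtering: the view uses only rows that satisfy the filter."""
    return [r for r in rows if predicate(r)]

def axis_limits(rows, predicate, column, lock_scales):
    """Return (min, max) for an axis.

    With lock_scales=True the limits come from the full table (an assumption
    of this sketch), so they stay fixed while filtering. Otherwise they are
    refit to the filtered subset.
    """
    source = rows if lock_scales else visible_rows(rows, predicate)
    values = [r[column] for r in source]
    return (min(values), max(values))

def only_va(row):
    """Example filter: keep only the VA rows."""
    return row["state"] == "VA"
```

Calling `axis_limits(rows, only_va, "income", lock_scales=False)` shrinks the axis to the filtered subset, while `lock_scales=True` keeps the full range so filtered views stay comparable.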

Hover detailing

A passive kind of interaction is to hover over a graphic element and have a floating window appear with more details. Sometimes these are called “tooltips” since they were originally used for toolbar icons. In a data visualization, hover detailing is a way to show both more precise values for the charted variables as well as additional related variables.

In the example, notice that county name was included in the hover details for the scatterplot even though that variable is not in the chart. That’s because the variable in the data table is marked with an attribute that causes it to appear in all hover details by default.
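
One way to picture that behavior is the Python sketch below. The flag list `always_in_hover`, the function `hover_details` and the row values are hypothetical stand-ins, not JMP's actual column attribute or API.

```python
# Hypothetical row and column flag; the values are invented for illustration.
row = {
    "County": "Example County",
    "Personal Income": 52.1,
    "Rep Pct": 37.2,
    "Ballot Rate": 0.68,
}
always_in_hover = ["County"]  # columns flagged to appear in every hover detail

def hover_details(row, charted, flagged):
    """List charted variables first, then flagged columns not already charted."""
    names = list(charted) + [c for c in flagged if c not in charted]
    return {name: row[name] for name in names}

details = hover_details(row, ["Personal Income", "Rep Pct"], always_in_hover)
```

Here `details` contains the two charted variables plus the flagged county name, and leaves out the unflagged, uncharted column.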

Model tuning

Model tuning means that the visual representation changes as a model parameter is changed. In the example, the model is a spline smoother, and the model parameter is the stiffness parameter, lambda.

Though many models are too slow for such interactive tuning, computers are fast enough to do more than we often realize. This simple-looking model has a lot going on. There are four separate smoothers, and each one has a 250x bootstrap confidence interval, so there are 1,000 spline models being fit each time the stiffness parameter changes.
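
The fit count is easy to check with a small Python sketch. The stand-in below is not a spline; it only averages the resampled Y values, so that the structure (four groups, 250 bootstrap resamples each, all refit whenever the stiffness changes) is visible. The names `fit_stand_in` and `refit_all` and the toy data are hypothetical.

```python
import random

def fit_stand_in(sample, stiffness):
    """Placeholder for one smoother fit. NOT a spline: it just averages y."""
    return sum(y for _, y in sample) / len(sample)

def refit_all(groups, stiffness, n_boot=250, seed=0):
    """Refit every group's bootstrap resamples for a new stiffness value."""
    rng = random.Random(seed)
    fits = {}
    for name, points in groups.items():
        fits[name] = [
            # Resample the group's points with replacement, then fit once.
            fit_stand_in(rng.choices(points, k=len(points)), stiffness)
            for _ in range(n_boot)
        ]
    return fits

# Four toy groups of (x, y) points, invented for illustration.
groups = {g: [(x, x * 0.5) for x in range(10)] for g in ("a", "b", "c", "d")}
fits = refit_all(groups, stiffness=1.0)
total_fits = sum(len(v) for v in fits.values())  # 4 groups x 250 resamples
```

Each call to `refit_all` corresponds to one change of the stiffness parameter, so `total_fits` is the number of model fits triggered by a single slider movement.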

Aesthetic mapping

Mapping aesthetic attributes to graph elements is a simple but important interaction. With aesthetics such as colors and line styles, you often need to see the result to know how well it works.

Simple attributes can be changed with a right-click, and more complex attributes such as the gradient color mapping require a settings dialog and thus have one less degree of interactivity.
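
For readers who have not worked with gradient color mappings, the Python sketch below shows the basic idea as a two-color linear interpolation. It is a generic illustration, not JMP's color mapping; the function name `gradient_color` and the default blue-to-red endpoints are assumptions.

```python
def gradient_color(value, lo, hi, start=(0, 0, 255), end=(255, 0, 0)):
    """Linearly interpolate an RGB color for value within [lo, hi].

    Values outside the range are clamped to the end colors.
    Assumes hi > lo.
    """
    t = (value - lo) / (hi - lo)
    t = max(0.0, min(1.0, t))
    return tuple(round(s + t * (e - s)) for s, e in zip(start, end))
```

Changing `start`, `end`, `lo` or `hi` changes every mark's color at once, which is why this kind of mapping usually needs more settings than a single right-click can offer.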

Column switching

In column switching, you replace one or more data columns with another column from a set. Here, the Y variable is replaced one-by-one with columns in the left-hand panel.

Column switching can be a good way to take a quick pass through a new data set.
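
In code terms, column switching amounts to swapping one role's column in a chart specification and redrawing. The Python sketch below assumes a hypothetical dictionary-based chart spec; `switch_column` is not a JMP function.

```python
def switch_column(spec, role, new_column):
    """Return a new chart spec with one role's column replaced."""
    updated = dict(spec)  # copy so the original spec is left unchanged
    updated[role] = new_column
    return updated

spec = {"x": "Personal Income", "y": "Rep Pct"}
candidates = ["Rep Pct", "Ballot Rate", "Personal Transfer"]

# One view per candidate Y column, in the order they would be stepped through.
views = [switch_column(spec, "y", c) for c in candidates]
```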

Volume slicing

Volume slicing is a way of exploring a multi-dimensional data space by graphing a set of two-dimensional slices. The Profiler in JMP uses slicing to explore multi-dimensional models. In this example, we’re modeling Rep Pct, on the Y, against three factors, Personal Income, Personal Transfer and Ballot Rate, which results in a four-dimensional surface. Each frame shows a slice across one X while the other Xs are held fixed.

Personal Transfer seems to have little effect (it’s mostly a flat line in this slice, at least), but as we interact with it and change its fixed value used by the other frames, we see an interaction with Personal Income. Personal Income has a loose positive slope for high Personal Transfer and a strong negative slope for low Personal Transfer values.
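
The slicing idea can be sketched in Python with an invented response surface. The `model` function below is purely hypothetical (it is not the fitted model from the video, and the factors are rescaled to a 0 to 1 range); it simply includes an income-by-transfer interaction so that the income slice changes sign as the fixed transfer value moves.

```python
def model(income, transfer, ballot):
    """Hypothetical response surface with an income x transfer interaction."""
    return 50.0 + (-30.0 + 40.0 * transfer) * income + 5.0 * ballot

def slice_along(model, fixed, factor, grid):
    """Evaluate the model along one factor while the others stay fixed."""
    return [model(**{**fixed, factor: v}) for v in grid]

def end_to_end_slope(ys, grid):
    """Average slope of a slice between its first and last grid points."""
    return (ys[-1] - ys[0]) / (grid[-1] - grid[0])

grid = [0.0, 0.5, 1.0]
low = {"income": 0.5, "transfer": 0.0, "ballot": 0.5}
high = {"income": 0.5, "transfer": 1.0, "ballot": 0.5}

slope_low = end_to_end_slope(slice_along(model, low, "income", grid), grid)
slope_high = end_to_end_slope(slice_along(model, high, "income", grid), grid)
```

In this toy surface, `slope_low` is strongly negative and `slope_high` is mildly positive, which mirrors the kind of interaction that only shows up when the fixed value of another factor is changed.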

Axis scaling

A scale defines how data values are mapped to screen coordinates and is itself visualized as an axis. The axis scaling interaction is the ability to change that mapping by directly manipulating the axis. The interactions include panning, stretching and zooming.

Axis scaling is not always just a matter of redrawing elements at different locations. Notice that the dot plot on the right has to adjust its marker dodging layout for the new scales, and the axes themselves sometimes have to recompute things like tick mark intervals. In the case of geographic visualizations, it’s even useful to switch projections depending on the scale.

For JMP, the regional scale uses an Albers equal area projection, and the world scale uses a Kavrayskiy VII compromise projection.
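
Returning to the basic definition at the start of this section, a scale as a mapping from data values to screen coordinates can be sketched in Python as follows. This is a minimal linear scale with pan and zoom, written under my own assumptions; the class `LinearScale` and its methods are hypothetical and do not cover projections, tick recomputation or layout changes.

```python
class LinearScale:
    """Maps data values in [dmin, dmax] to pixels in [pmin, pmax]."""

    def __init__(self, dmin, dmax, pmin, pmax):
        self.dmin, self.dmax = dmin, dmax
        self.pmin, self.pmax = pmin, pmax

    def to_pixel(self, value):
        """Convert a data value to a screen coordinate."""
        t = (value - self.dmin) / (self.dmax - self.dmin)
        return self.pmin + t * (self.pmax - self.pmin)

    def pan(self, delta):
        """Shift the visible data range by delta (in data units)."""
        self.dmin += delta
        self.dmax += delta

    def zoom(self, factor, center):
        """Zoom about a data value; factor > 1 zooms in."""
        self.dmin = center - (center - self.dmin) / factor
        self.dmax = center + (self.dmax - center) / factor
```

Panning and zooming only change the data range. Every element is then redrawn through `to_pixel`, which is the simple case that the dodging and tick examples above go beyond.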

More interactions?

Given these nine data visualization interaction methods, we can better consider further questions:

What other general interaction methods are useful in data visualization? I’m saying “general” methods since I imagine there is no limit to the number of specialty interactions geared toward a particular kind of graph. For instance, JMP has a way to interactively change histogram bin sizes that I didn't include here as a general method.
Should some of these methods be split into multiple methods? Maybe volume slicing is different for data exploration versus model exploration. I’m considering interactive graph resizing as a kind of axis scaling, but maybe it’s worth having its own category.
Should some of these methods be combined as varieties of the same method? Maybe row filtering and column switching are two varieties of a larger subsetting interaction.
Are there better names for these methods, either pre-existing or new? Some of these names come from JMP terminology, and some are my own invention for this discussion.
Should different levels of each method be recognized? For instance, should we distinguish between graph building with or without a live preview?

I hope to hear feedback on these questions and others, either here, at JSM or on Twitter.

Phil_Kay · 08-16-2019

I had never really thought how many different ways there are to interact with JMP. It is really remarkable, especially when considering all the complex challenges that you highlight. Great post as always. Thanks, Xan.