Working with graphics segments and how to create spider charts in JMP

The Engineering Mailbag

Episode 5: I. DO. NOT. LIKE. SPIDERS!!! (And spider plots can take a hike, too.)

I can't ignore spider plots forever, so here we go...

Every now and again, we systems engineers run into interesting questions that would fall somewhat outside the typical range of JMP usage. These applications are generally clever, often bringing home how using data isn’t just for business or technical problems. Other times, the questions are just unexpected, challenging problems. These “curve balls” (as I like to call them) come in many different forms: coding problems, interesting analyses, ways of visualizing data…you get the idea.

Occasionally, I’ll get questions about problems that I really don’t want to answer. Today's query is in that little pile, and I have avoided this one for years! The subject of this entry in The Mail Bag is generally regarded as a scary, unpleasant plot by the data viz community, so I guess it’s appropriate for a Halloween post.

Since this type of plot shows up regularly for cosmetics and consumer analytics customers in my day-to-day work, I can't ignore it forever. So, I’m taking this as an opportunity to help out a number of user groups in New York and New Jersey. But I have to say…I really do not like spider plots.

The question

Like I said, I’ve received this question more often than I care to mention. So, I’m going to skip the actual question email; you'll just have to trust me that there have been several.

There were also some in-person requests.

And a few posts on the JMP Community.

Yeah.

Below is an example from a peer-reviewed paper that someone gave me as a reference for what they would like to see:

(J. Inst. Brew. 2012; 118: 325–333) Author's note to any data visualization specialist who may be reading this: Please forgive me. I know this is an awful graph.

The response

My general response for these requests historically falls along the lines of what most data visualization experts have said, which is to discourage people from using this plot if at all possible. However, recently, I started thinking about this persistent question a little differently. I’ve since taken a different tack on the problem.

My current philosophy about spider plots is that I’m open to helping people see how to make one if I can also show them better ways to present the same information along the way. As was the case with the other visualization question I answered, for me, the process of making this chart was much more instructive than the actual result.

I. HATE. SPIDERS.

OK, I have (or rather, had) a strong antipathy for spiders for many years. I wouldn’t quite call it a full-blown case of arachnophobia, but for many years I would dispatch anything with eight legs (and occasionally six; you can never be too sure!) with extreme prejudice. I gave no mercy. I gave no quarter. I. Took. No. Prisoners. Anyway, I eventually got over it. And, with the exception of ticks (vampire spiders!), I have made peace with my eight-legged friends. We generally just stay out of each other’s way now. I’d like to think that part of my aversion to spider plots came from my issue with spiders, but it could also be that they’re a royal pain and I instinctively knew that making one would be a bit scary. That’s not to say there aren’t some interesting coding problems involved in constructing them, particularly if you want them to have the interactivity users expect from JMP (just that I knew the code itself was going to be a bit intimidating). Anyway, let’s have a look at the problem.

Scoping the problem

Practicing what I preach, I started with a problem statement:

I want to make a JMP add-in that displays a spider plot and other possible ways of visualizing a data set in a framework that I can add on to later.

The next step is to create a punch list of the different things that I need to do:

  • Get the dimensions for graphing from the user.
  • Get a label column from the user for each data subset.
  • Construct an interactive spider chart for the provided dimensions by each subset.
  • Construct an interactive radial chart for the provided dimensions by each subset.
  • Construct an interactive parallel chart for the provided dimensions by each subset.
  • Construct an interactive table of pairwise correlations for the provided dimensions.

Scoping the problem revealed some good news: I can offload a lot of this stuff to JMP if I use Application Builder, which will also make it really easy to add new visualizations later if needed. The challenging bits were mostly around the first two charts and making them interactive. JMP’s graphs work in Cartesian coordinate systems (x and y coordinates). The spider and radial charts are, for all intents and purposes, in a polar coordinate system (r and φ coordinates). As a result, I am going to need some code to convert back and forth between Cartesian and polar coordinate systems. Let’s start there.

Coordinate transform functions

Here’s the function that I used for working between the two coordinate systems:

polarToRect = Function( {r, t},
            Eval List( {r :* Cosine( t ), r :* Sine( t )} )
);

It’s fairly simple. Just what you’d find in a geometry textbook. It is a pretty nice little bit of code, in that it takes a radius (r) and an angle (t) and returns a coordinate pair (x,y). Because it multiplies element by element, it works just as well on whole vectors of radii and angles. BTW, this function is also in an application that I reference later on that’s included in the JMP sample applications.
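A minimal sketch of how the transform behaves on a set of evenly spaced spokes. The spoke count and the unit radii are made up for illustration, and the function is written with JSL's element-wise multiply (`:*`) so it accepts column vectors:

```jsl
Names Default To Here( 1 );

// The polar-to-Cartesian transform from the text, using element-wise multiply
polarToRect = Function( {r, t},
	Eval List( {r :* Cosine( t ), r :* Sine( t )} )
);

nP = 4;                                // hypothetical: four spokes
th = Shape( (1 :: nP) - 1, nP, 1 );    // column vector 0, 1, 2, 3
th = 2 * Pi() * th / nP;               // evenly spaced angles: 0, pi/2, pi, 3*pi/2
r = J( nP, 1, 1 );                     // hypothetical: unit radius on every spoke

// The function returns a list, so it can be unpacked into two matrices
{x, y} = polarToRect( r, th );
Show( x, y );  // roughly (1,0), (0,1), (-1,0), (0,-1), up to floating-point rounding
```

The same angle construction (spoke index minus one, times 2π over the number of spokes) shows up at the top of both plotting functions later on.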

Because I’m working in Application Builder, just about everything else I’m going to do will be either a function or an expression. Everything about this project is also spectacularly repetitive, so it will make the iterative parts of my code easier to read if I’m just calling functions.

The easy bits

If you have a look at the final app source, you can see that I’m leveraging some existing parts of JMP: Graph Builder for a parallel plot, and Multivariate for the pairwise correlations. They were created, almost exclusively, using JMP generated code or capabilities in the Application Builder. So, we’ll get them out of the way first.

Because of the way that Graph Builder treats columns in this case, and the fact that I have to be able to handle an arbitrary number of variables, it’s easier for me to build the Graph Builder script as an expression and then append it to the main display tree in the correct place. Here’s the code for that:

drawParallel = Function( {pCol},
            {default Local},
           
            // build the list of variables with the format "X( :col1), X( :col2)" as a string
            // also building a list of plot elements
            colStr = "";
            gElemStr = "";
            For( i = 1, i <= N Items( pCol ), i++,
                        If( i == 1,
                                    colStr = colStr || "X(" || Char( pCol[i] ) || ")";
                                    gElemStr = gElemStr || "X(" || Char( i ) || ")";
                        ,
                                    colStr = colStr || ", X(" || Char( pCol[i] ) || ", Position( 1))";
                                    gElemStr = gElemStr || ", X(" || Char( i ) || ")";
                        )
            );
           
            // Get the grouping variable as a string
            xVarStr = Char( gCol );
           
            // Insert the strings into the main string
            Eval(
                        Parse(
                                    Eval Insert(
                                                "parallelVLB << Append(Graph Builder(
                        Size( 766, 256 ),
                        Show Control Panel( 0 ),
                        Show Legend( 0 ),
                        Variables( ^colStr^, Color( ^xVarStr^ ) ),
                        Elements( Parallel( ^gElemStr^, Legend( 3 ) ) ),
                        SendToReport(
                                    Dispatch(
                                                {},
                                                \!"Graph Builder\!",
                                                OutlineBox,
                                                {Set Title( \!"Parallel Plot\!" ), Image Export Display( Normal )}
                                    ),
                                    Dispatch( {}, \!"X title\!", TextEditBox, {Set Text( \!"Variables\!" )} ),
                                    Dispatch( {}, \!"graph title\!", TextEditBox, {Set Text( \!"\!" )} )
                        )
            ))"
                                    )
                        )
            );
);

Notice that all I’m doing is formatting the variables as strings, plugging them into the right spots in the code, and then running it. The end result is a fully interactive parallel plot in my app with very little coding on my part. I tried to script up a parallel plot manually; it worked, but just barely. I don’t recommend it. Just use Graph Builder and Append(). I’m also using the Eval Insert() construct, which replaces each name wrapped in carets (like ^colStr^) with that variable’s value. It’s a little easier to read than constructing the string using concatenations.
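Here is a stripped-down sketch of that Eval Insert() / Parse() / Eval() round trip, separate from the app. The variable names and the tiny template strings are hypothetical stand-ins for the much longer Graph Builder string above:

```jsl
Names Default To Here( 1 );

// Hypothetical stand-in for the column string built in the loop above
colStr = "X( :height ), X( :weight, Position( 1 ) )";

// Eval Insert() replaces ^name^ with the value of that variable
filled = Eval Insert( "Variables( ^colStr^ )" );
Show( filled );  // the template with colStr spliced in, still just a string

// The same idea end to end: build the code as a string, parse it, run it
a = 3;
b = 4;
code = Eval Insert( "result = ^a^ + ^b^;" );
Eval( Parse( code ) );
Show( result );  // result = 7;
```

The trade-off is the usual one for string-built code: it is easy to read and to extend to an arbitrary number of columns, but syntax mistakes in the template only surface when Parse() runs.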

The multivariate plot is just a simple parameterization of the Multivariate platform in JMP. I didn’t even have to write code to get this bit! There’s an Advanced Mastering JMP webinar that shows how to do that with just mouse clicks. Again, I’m just doing these bits this way for convenience and because I really didn’t want to spend a huge amount of time on this part of the app. It’s not being lazy – it’s being an efficient coder. JMP gives you the code, so you might as well save yourself some time.

The radial plot

The radial plot is actually a port (with some modifications) of an example app that’s included in JMP’s samples directory. The big modification is that I’m using Marker Seg() instead of Marker(). And, that’s actually an important point. It’s possible to make Marker() interactive using some mouse capture commands and some logic, but Marker Seg() handles all this automatically. You just have to tell it which data table you need it to monitor.

drawFlies = Function( {nP, pCol, obj},
            {default Local},
           
            // Create a list of angles for the spider plot
            th = ((1 :: nP) - 1);
            th = Shape( th, nP, 1 );
            th = 2 * Pi() * th / nP;
           
            //Get the data from the table as a matrix.
            sel_dat = J( N Row( DataTable1 ), N Items( pCol ), 0 );
            For( i = 1, i <= N Items( pCol ), i++,
                        sel_dat[0, i] = pCol[i] << get as matrix
            );
           
            // now get a matrix with two columns and nvars rows to keep track of the min and max of each variable
            min_max = J( nP, 2, 0 );                   // column1: minimum, column2: maximum. row for each variable chosen
            For( i = 1, i <= nP, i++,
                        min_max[i, 1] = Min( sel_dat[0, i] );
                        min_max[i, 2] = Max( sel_dat[0, i] );
            );
           
            // Create some empty matrices for the converted values
            xMat = [];
            yMat = [];
           
            // Convert the data matrix
            For( i = 1, i <= nR, i++,
                        current = sel_dat[i, 0];
                       
                        // for each variable, subtract the minimum and then divide by the range.
                        adj = J( N Row( current ), 1, 0.0 ) + 1.0 * ((current - min_max[0, 1]`) :/ (min_max[0, 2] - min_max[0, 1])`);
                        adj = adj`;
                        total = Sum( adj );
                        {x, y} = polarToRect( adj, th );
                        x = Sum( x ) / total;
                        y = Sum( y ) / total;
                        xMat = xMat |/ x;
                        yMat = yMat |/ y;
            );
           
            // Draw markers
            annotateRad = Expr(
                        // Matrices and book-keeping variables
                        xCoord = xxx;
                        yCoord = yyy;
                        obj[FrameBox( 1 )] << Append Seg( Marker Seg( xxx, yyy, Row States( DataTable1 ) ) );
            );
           
            // Substitute values into the annotation expression
            Substitute Into( annotateRad, Expr( xxx ), xMat, Expr( yyy ), yMat );
            annotateRad;
);

As you can see, A LOT of the code is dealing with the radial transforms, etc.  The interesting bit for this discussion doesn’t happen until you’re almost at the end of the function. (Hint: Look for the Draw Markers comment.) I’m appending a Marker Seg() to the Graph Elements stack. If you right-click on the radial plot in the finished report and select Customize, you can see it in there:

image2.png

The reason that Marker Seg() is so powerful is that it has that Row States() reference in it. Remember, everything in JMP is linked through the data table via row states. So, by telling JMP which data table the report needs to communicate with, I can sync up all the charts in the report and have all the interactivity I’m used to with JMP. And it's just by going from Markers to Marker Segments (which blew my mind when I got it working). 
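A minimal sketch of the same idea outside the app, assuming the Big Class sample table that ships with JMP. The window title, axis ranges and variable names are placeholders of mine; the Append Seg( Marker Seg( ..., Row States( ... ) ) ) call follows the pattern used in drawFlies above:

```jsl
Names Default To Here( 1 );

// Assumption: the standard Big Class sample data is installed
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );
xVals = Column( dt, "height" ) << Get As Matrix;
yVals = Column( dt, "weight" ) << Get As Matrix;

// A bare Graph Box with a small drawing script, so nothing here is a platform
win = New Window( "Marker Seg sketch",
	gb = Graph Box(
		X Scale( 50, 72 ),
		Y Scale( 50, 180 ),
		Text( {52, 170}, "Select rows in the data table" )
	)
);

// One marker per row; Row States( dt ) is what links the markers to the table
gb[FrameBox( 1 )] << Append Seg( Marker Seg( xVals, yVals, Row States( dt ) ) );
```

If the link works as described, selecting or coloring rows in the data table should be reflected in the markers, and selecting markers in the graph should select the rows.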

The spider plot

OK, we’ve got nothing else that I can talk about except that dang spider plot. So, let’s get into it.

Since a spider plot is basically a parallel plot wrapped around a central axis, we’re more or less going to have to make a parallel plot – with all the points projected onto the polar coordinate system. And, if that sounds painful, you’re right. It is. It’s also really, really hard for your brain to process radial information. So, that one change – from Cartesian to polar – makes spider charts significantly harder to read. Anyway, I’ll save the discourse on the merits of this graph for another time. On to the code!

The good news is that since I’m really only dealing with one dimension per variable, I have full control over the angular part of the coordinate system. So, there’s some transformation work there, but it’s not that bad. It gets messy when you need to consider that the user might want a wider scale displayed than the data actually covers, e.g., the values are between 3 and 5 but the possible scale is 1 to 7. I was able to handle that by looking for an axis column property and then getting the max and min values from there. Since that’s not a critical point in the narrative, you can see how that was done in the source code I’ve included with the add-in.
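To see why the choice of scale matters, here is a small sketch using the numbers from the example above (values between 3 and 5 on a possible scale of 1 to 7). The variable names are hypothetical; the arithmetic is the same subtract-the-minimum, divide-by-the-range step used in the plotting functions:

```jsl
Names Default To Here( 1 );

vals = [3 4 5];   // hypothetical responses for one variable

// Option 1: scale by the observed range of the data
dataAdj = (vals - Min( vals )) / (Max( vals ) - Min( vals ));

// Option 2: scale by the limits stored in an Axis column property
axisMin = 1;
axisMax = 7;
axisAdj = (vals - axisMin) / (axisMax - axisMin);

Show( dataAdj, axisAdj );
// dataAdj runs from 0 to 1, so the smallest value collapses onto the center;
// axisAdj runs from about 0.33 to 0.67, leaving room on both ends of the spoke
```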

The radial spokes and reference lines are just lines drawn using the Line() command. The hardest part of this whole thing was making the webs for each row interactive and getting the legend to work. I made the lines interactive by employing a Line Seg() and a Marker Seg(). The legend was a repurposing of a really old piece of JMP I ran into recently called a Row Legend. Let’s look at each of them.

Here’s the code for creating the lines on the spider chart. As with the radial chart, it takes a lot of data manipulation just to get it into a format that makes sense to graph: 

drawTrails = Function( {nP, pCol, obj},
            {default Local},
           
            // Create a list of angles for the spider plot
            th = ((1 :: nP) - 1);
            th = Shape( th, nP, 1 );
            th = 2 * Pi() * th / nP;
           
            //Get the data from the table as a matrix.
            sel_dat = J( N Row( DataTable1 ), N Items( pCol ), 0 );
            For( i = 1, i <= N Items( pCol ), i++,
                        sel_dat[0, i] = pCol[i] << get as matrix
            );
           
            // now get a matrix with two columns and nvars rows to keep track of the min and max of each variable
            min_max = J( nP, 2, 0 );                   // column1: minimum, column2: maximum. row for each variable chosen
            // Check if the user wants to use axis column property values for max and min.
            If( axisProp == 0,
            // Directly calculate the values from the data table. 
                        For( i = 1, i <= nP, i++,
                                    min_max[i, 1] = Min( sel_dat[0, i] );
                                    min_max[i, 2] = Max( sel_dat[0, i] );
                        )
            ,
            // If the user wants to use axis column property values
                        For( i = 1, i <= nP, i++,
                                    // Pull the column name and check if the axis property is defined
                                    col = pCol[i];
                                    axisFlag = Contains( col << Get Properties List, Expr( Axis ) );
                                    //If the axis property is present pull the values from the property
                                    If( axisFlag == 1,
                                                axisPresent = col << Get Property( "Axis" );
                                                min_max[i, 1] = Eval( Extract Expr( axisPresent, Min( Wild() ) ) );
                                                min_max[i, 2] = Eval( Extract Expr( axisPresent, Max( Wild() ) ) );
                                    ,
                                    // if the property is not present, calculate the values directly from the data
                                                min_max[i, 1] = Min( sel_dat[0, i] );
                                                min_max[i, 2] = Max( sel_dat[0, i] );
                                    );
                        )
            );
           
            // Create some empty matrices for the converted values
            xMat = [];
            yMat = [];
           
            // Convert the data matrix
            For( i = 1, i <= nR, i++,
                        current = sel_dat[i, 0];
                       
                        // for each variable, subtract the minimum and then divide by the range.
                        adj = J( N Row( current ), 1, 0.0 ) + 1.0 * ((current - min_max[0, 1]`) :/ (min_max[0, 2] - min_max[0, 1])`);
                        adj = adj`;
                        // an error check for missing values in the vector
                        For( k = 1, k <= N Rows( adj ), k++,
                                    If( Is Missing( adj[k] ),
                                                adj[k] = 0
                                    )
                        );
                        {x, y} = polarToRect( adj, th );
                        xMat = xMat |/ x;
                        yMat = yMat |/ y;
            );
           
            // Draw Lines and Markers for spider plot
            annotateSpider = Expr(
                        // coordinate matrices
                        xMat = xxx;
                        yMat = yyy;
                        nParam = nnn;
                        nPoints = N Rows( xMat );
                        nGroups = nPoints / nParam;
                                   
                                    // Reshape the matrices to make them easier to work with
                        xSMat = Shape( xMat, nGroups );
                        ySMat = Shape( yMat, nGroups );
                       
                        // Create an empty string to hold a list (as a string)
                        pathList = "";
                                   
                        // Draw the lines as polygons using the reshaped matrix
                        For( i = 1, i <= nGroups, i++,
                                               
                                    // Get the coordinates for a given path and add the first value to the end to close the path
                                    xCoord = xSMat[i, 0];
                                    xCoord = xCoord || xCoord[1];
                                    yCoord = ySMat[i, 0];
                                    yCoord = yCoord || yCoord[1];
                                   
                                    // Append the closed path as an interactive line tied to row i of the data table
                                    obj[FrameBox( 1 )] << Append Seg( Line Seg( xCoord, yCoord, Row States( DataTable1, {i} ) ) );
                                               
                        );
                                               
                        // Draw the markers
                        For( i = 1, i <= nP, i++,
                                               
                                    // Get the coordinates for a given path
                                    xCoord = xSMat[0, i];
                                    yCoord = ySMat[0, i];
                                   
                                    // Run Markers Seg (need to move and reformat the matrix)
                                    obj[FrameBox( 1 )] << Append Seg( Marker Seg( xCoord, yCoord, Row States( DataTable1 ) ) );
                        );
            );
           
            // Substitute values into the annotation expression
            Substitute Into( annotateSpider, Expr( xxx ), xMat, Expr( yyy ), yMat, Expr( nnn ), nP );
           
            // Run the annotation expression
            annotateSpider;
);

The important bits are the Line and Marker segments toward the end. I had to draw each connecting line individually. There is a Path Seg() that would have made this really easy, but it creates a filled polygon if you try to color it. So, Lines and Markers it was! By using Graphic Segments, it's really easy to hook back to the data table through the Row States operator. Now, because all the graphs are looking at the row states from the main data table, it becomes possible to color them all simultaneously by assigning colors to each Row State in the data table. 
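A minimal sketch of drawing one closed web with a Line Seg(), again assuming the Big Class sample table only as something to hang the row state on. The three radii are invented, and the Row States( dt, {1} ) argument simply mirrors the call in drawTrails rather than documenting every form Line Seg() accepts:

```jsl
Names Default To Here( 1 );

// Assumption: the standard Big Class sample data is installed
dt = Open( "$SAMPLE_DATA/Big Class.jmp" );

// Hypothetical scaled radii for one row on three spokes
adj = [0.8, 0.5, 0.6];
th = 2 * Pi() * [0, 1, 2] / 3;

// Polar to Cartesian, then transpose to row vectors
xRow = (adj :* Cosine( th ))`;
yRow = (adj :* Sine( th ))`;

// Repeat the first point at the end so the outline closes
xClosed = xRow || xRow[1];
yClosed = yRow || yRow[1];

win = New Window( "Line Seg sketch",
	gb = Graph Box(
		X Scale( -1, 1 ),
		Y Scale( -1, 1 ),
		// plain Line() calls for the static reference axes
		Line( {-1, 0}, {1, 0} );
		Line( {0, -1}, {0, 1} );
	)
);

// The web for row 1, linked to that row's state in the table
gb[FrameBox( 1 )] << Append Seg( Line Seg( xClosed, yClosed, Row States( dt, {1} ) ) );
```

The static axes use ordinary Line() calls because they never need to respond to the data table; only the web itself is a graphics segment.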

When you right-click on some of the graphs in JMP, there will be the Row Legend option. It’s been in JMP for a while and does two things. First, it colors the rows in the data table by column (like the Red Triangle Menu option in the data table does). Second, it creates a little legend next to the visualization. That’s all I had to do to get the colors in sync with one another across four graphs. Here’s the code:

drawLegend = Function( {gVar, obj},
            {default Local},
            // extract the column name with the grouping variable
            colName = gVar[1];
           
            // pass the variables into the expression
            annotateLegend = Expr(
                        obj[FrameBox( 1 )] << Row Legend( ggg, Color( 1 ), Continuous Scale( 0 ) )
            );
           
            // substitute into the expression
            Substitute Into( annotateLegend, Expr( ggg ), colName );
           
            // run the expression
            annotateLegend;
);

Since all my graphs are looking at the same data table for row states, they all automatically inherit the coloring the Row Legend assigns! Super slick. 

Wrapping things up

And that’s it. For scripters, the main thing I’d like you to get out of this is the power of the Graphics Segments. A lot of the bits that make JMP so special are wrapped up in what those functions can do. For everyone else, there’s now a spider plot add-in. I’m pretty proud of how it turned out. But, please don’t use it. There are much better ways of visualizing that type of data.

Author's note

No spiders were harmed in the writing of this blog, although some may have taken umbrage at being lumped in with insects (six-legs) or ticks (also arachnids, but they’re vampire spiders, so I’d like to think that even spiders hate them).

Last Modified: Oct 21, 2020 1:44 PM
Comments
mzwald
Staff

I have yet to encounter a customer inquiring about spider plots, and I agree with you on their utility.  Seems like if they are popular enough, there should be an option in the parallel plot to convert it into the spider (shudder) format.  But a nice JSL exercise, well done!

MikeD_Anderson
Staff

Yeah - I seriously doubt we'll make a Spider Plot.  Truthfully, I hope we don't.  It's a bad plot.  But, it was a fun exercise.

P_Bartell
Level VIII

My suspicion regarding the 'demand' for spider charts is that it's fallout from a more general mindset among many, many JMP users, and those contemplating becoming JMP users: the prejudice that "Excel can do (fill in the blank), so how about JMP?" While I never got many questions asking about spider charts specifically, as a JMP systems engineer like @MikeD_Anderson and @mzwald, I can't think of a week that went by where I didn't get the "I can do (again...fill in the blank) in Excel...how do I do it in JMP?" question.

 

So sometimes there is an exact duplicate for something Excel does in JMP...but more often than not, there's either an easier way OR better way to make the same data viz/analytics point. So like @MikeD_Anderson my focus for the latter case was always, "Lemme show you an easier (read fewer mouse clicks) more impactful way to make the same points!"

shampton82
Level VII

Spider charts are really useful when visualizing dimensional data on round parts.  For us in the gas turbine business, these are used extensively.  It would be great to see them in JMP someday outside an add on.

 

steve

MikeD_Anderson
Staff

@shampton82 - I'm going to stick to my guns here and say that even though the parts are round, using a radial coordinate system is probably making it harder for you to find issues.  Here's an example:

MikeD_Anderson_0-1604339165973.png

Both plots show the same 15 samples with the same 60 radii measurements collected on each part.  Looking at the radial plot on the right (basically a spider plot without the lines), you might be able to see something out of whack - but it's unlikely.  However, just doing a line chart of the exact same data, you can see that there are some serious issues between 6 and 36 degrees on one or two parts.  In fact, just doing a normal XbarR control chart (on an extended data set with the same issues) and paying attention to the variability chart, you can see that there is a pretty serious problem that looks like it might be systematic:

 

MikeD_Anderson_0-1604339519799.png

The problem really boils down to the fact that even though a part or phenomenon might be "circular" in nature - people do better cognitively with linear representations.

shampton82
Level VII

Hey 

Sticking to your guns, eh? Now it's a challenge!  Ha!  No, I won't try to convince you, and I agree there are a lot of other ways of viewing the data that are useful as well.  There are two points I'd like to make. First, your first chart may not be a fair comparison, as the spider graph's radial axis seems too zoomed out.  Second, the reason they are so useful for round parts is for interpreting the relationship of the dimensions around the part at a glance.  In this example (I know, I know, it's Excel; I swear I don't use it much!) it's much easier (for me) to interpret the pinched-in nature of locations 11 and 22 (red arrow) than on the line chart.  It also makes it easier to visualize what may move in the opposite direction, 90 degrees from the pinched-in points, if I were to try to move them outboard:

shampton82_0-1604382031041.png

So when you're trying to visualize how to repair a round part, this can add some clarity without having to think about which points are opposite each other or 90 degrees from each other, as you would on the line chart.

 

Hopefully this makes sense.  I've been using the parallel plots a lot more, but the spider plot really shines in certain cases.

 

Steve

MikeD_Anderson
Staff

@shampton82  - why not just use the difference in the radii?

MikeD_Anderson_0-1604420117198.png

 

if you went that route, you could even make the graph more information dense to look for systematic issues: 

MikeD_Anderson_1-1604420376658.png

 

markschahl
Level V

@MikeD_Anderson: nice work! I too hate the Excel radar plots. They are used because of their long-time availability in Excel. My product folks use them for displaying material properties with very different scales for different materials: flexural modulus, tensile modulus, impact strength, softening temperature, etc. I've convinced some of them to use the vastly superior parallel plot. Ditto for round objects. Bar charts work well for showing over/under dimension.

Looking at a parallel plot will not strain one's neck like a radar plot.

BTW, this hangs on my office door: Tradition

MikeD_Anderson
Staff

Yeah - pie charts are another painful one... they can be useful if used correctly. But they're definitely a weapon of last resort.

 

M

 

PS - love that poster!

abmayfield
Level VI

As one of the many naggers for this add-in, I sincerely appreciate @MikeD_Anderson for having developed it even despite his dislike of spider graphs. I do agree that, with more than one sample, they are not useful (e.g., the brewing graph above), but I still like them for showcasing properties of a single sample. For instance, I actually like the candy bar example (albeit for a single candy bar, not for comparing two).

Screen Shot 2022-02-08 at 10.22.09 PM.png
Screen Shot 2022-02-08 at 10.23.04 PM.png

These two figures in theory show the same information: what lives on the bottom of three different types of coral reefs (exposed, intermediate, and protected reefs). Despite the beauty of the latter (made via the add-in), I do think the former (stacked bar chart in Graph Builder) is more informative. But for individual corals, I do have some situations where I think the spider plot could shine over the parallel plot or what I would normally use for data on individuals, a heat map (cell plot). Anyway, worth your time and effort and much appreciated!

Screen Shot 2022-02-09 at 8.42.12 AM.png
I have a strong urge to draw eyes and a smile on this.

K_Stan
Level I

Spider plots (or Radar plots, as they are called in Excel) provide an excellent way to visualize patterns. In my analysis, patterns are very important, and no bar charts (or other charts, for that matter) represent data visually better than Radar (Spider) plots. I work in biotechnology and have to visualize data related to cellular marker expression. I specifically moved away from all other charts to Radar plots because I am able to see patterns in them and organize them in such a way that changes are immediately noticeable and easily attributed to certain biological conditions. That is why I am not using my JMP Pro 17 for that purpose. It is much easier and faster to achieve what I need in this respect in Excel.