anne_milley

JMP 13 Preview: The power of the new Virtual Join

Have you ever wanted to include data in an analysis without having to subset it from different tables and put it all together in a new table? Have you wanted to “see” how your data will come together before committing to joining many tables to make sure you get it right the first time? Soon you can, with Virtual Join!

The new Virtual Join feature in JMP 13 lets you link a main table with multiple auxiliary tables through common key columns. Linking is done through two column properties, Link ID and Link Reference, and it is largely automatic: once the properties are set up, you have access to all the columns from the auxiliary tables.

The linked tables interact together as one all-inclusive data table, with all the columns from all the tables linked in virtually, without actually copying the data into the main table.
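To make the idea concrete, here is a minimal conceptual sketch in Python. It is not JMP code and does not reflect how JMP implements Virtual Join. The `Table` class, its methods, the `Name[Customer ID]` naming of linked columns and the sample rental data are all assumptions made for illustration. It shows only the general principle: a key column marked as a link ID in an auxiliary table, a link reference from the main table, and values looked up on demand instead of being copied.

```python
class Table:
    """A tiny column-oriented table (hypothetical, for illustration only)."""

    def __init__(self, name, columns, link_id=None):
        self.name = name
        self.columns = columns      # column name -> list of values
        self.link_id = link_id      # column whose values uniquely identify rows
        self.link_refs = {}         # local column name -> referenced Table
        self._key_index = {}
        if link_id is not None:
            # Map each key value to its row number for fast lookup.
            self._key_index = {
                key: row for row, key in enumerate(columns[link_id])
            }

    def link_reference(self, column, target):
        """Point a local column at the link ID column of another table."""
        if column not in self.columns:
            raise KeyError(column)
        if target.link_id is None:
            raise ValueError(f"{target.name} has no link ID column")
        self.link_refs[column] = target

    def column_names(self):
        """Own columns plus the columns reachable through links."""
        names = list(self.columns)
        for column, target in self.link_refs.items():
            for other in target.columns:
                if other != target.link_id:
                    names.append(f"{other}[{column}]")
        return names

    def value(self, row, name):
        """Read a cell, resolving linked columns at lookup time."""
        if name in self.columns:
            return self.columns[name][row]
        for column, target in self.link_refs.items():
            suffix = f"[{column}]"
            if name.endswith(suffix):
                base = name[: -len(suffix)]
                if base in target.columns:
                    key = self.columns[column][row]
                    target_row = target._key_index.get(key)
                    if target_row is None:
                        return None  # no matching key in the auxiliary table
                    return target.columns[base][target_row]
        raise KeyError(name)


# Made-up sample data: an auxiliary table of customers and a main table of rentals.
customers = Table(
    "customers",
    {"Customer ID": [101, 102], "Name": ["Ana", "Ben"]},
    link_id="Customer ID",
)
rentals = Table(
    "rentals",
    {"Customer ID": [102, 101, 102], "Title": ["Up", "Heat", "Alien"]},
)
rentals.link_reference("Customer ID", customers)

for row in range(3):
    print(rentals.value(row, "Title"), rentals.value(row, "Name[Customer ID]"))
```

In this sketch the customer names are stored once, in the auxiliary table, and the main table stores only the keys. That is the property the paragraph above describes: the linked columns are available through the main table without being copied into it.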

Here’s a schema showing the relationships among virtual join tables from JMP tester Mandy Chambers’ Discovery Summit poster titled “Fast, Powerful, Efficient: Joining Without Joining to Explore Summer Games Data with JMP”:

[Image: schema of the relationships among virtual join tables]

Principal developer Chung-Wei Ng explains how the idea for Virtual Join originally came about. Many JMP versions ago, she was thinking about how to link multiple tables together and looking for a simple way to show related data from different tables. JMP already had features such as summary tables, linked subsets and by-groups, but she thought it would be more useful to generalize the approach.

She started exploring a different paradigm and showed some early thoughts to John Sall, the chief architect of JMP. At that time, he was working on the Choice platform, which can use multiple tables. Their conversation led to a suggestion by John: It would be nice if JMP could automatically link those tables together, so the columns are accessible through the same interface.

The idea languished until last year, when Chung-Wei got a set of related data tables on movie rentals from fellow developer Eric Hill: “When I was playing with the data tables, the idea suddenly struck me ‘Wouldn’t it be cool if I could virtually join those tables together, so all those related data from the different tables are accessible as if they were all in the same data table?’” Seeing how the data are related from those tables gave her an idea of how it should all work.

Join is one of the most-used data manipulation tools in JMP. It can be very memory-intensive, both during the join itself and in holding the resultant table. Most of the time, you join data tables to bring related data into one table through a set of common keys, so that the resultant table can be used in analysis platforms. Many of those joins can now be replaced with a virtual join.

This saves disk space, memory and time. With Virtual Join, you can keep your data in a simpler form: related data don’t have to be duplicated in every table that references them.

Colleagues and early adopters like the functionality and ease of use. Longtime JMP user Cy Wegman, President of SY64, LLC, says: “Table management and manipulation has always been a challenge. The Virtual Join will significantly improve my productivity by decreasing errors and making table management so much cleaner.”

Chung-Wei is very happy to work on something that will be so useful and says, “I know the users will find use for it in ways that I never even dream of now. I just love working on the data table. The Data Filter lets you zoom in on the table; now Virtual Join lets you zoom out.”

To learn more about what's coming next month in JMP 13, visit the preview site. There, you can sign up to watch the live stream of John Sall's speech launching JMP 13 as well as view videos and see a list of new features.

1 Comment
Community Member

philomena wadden wrote:

What a great idea!