Cause and Effect Diagram: The Hidden Champion for Visualizing Complex Structures

Analytical programs need data in a strict row and column structure, and rows are usually treated as independent observations. The cause and effect diagram puts rows into a hierarchy, described by pairs of parent-child relationships. Originally intended to document brainstorming results in quality management, this platform is a powerful tool to visualize other structures as well.

Generic data structures in JSL scripting are associative arrays and lists. Their elements can be values, as well as lists and associative arrays, providing a framework for quickly and efficiently managing complex data structures. Well-established in system communication, JSON interfaces are another example of hierarchical data structures. If items are treated in different ways and varying parameters are measured after treatment, the actual combinations can be documented in the same way.

In this presentation, a JSL function is presented (and provided) that crawls through a list or an associative array and puts all the content into a data table. The cause and effect diagram displays the content in a graphical way. Application examples illustrate the versatility and power of this concept.

Welcome, and you are invited to follow me into the abyss of deep hierarchical structures. Let's start first with a data table. This is the layout that we want to see. A complete grid, no empty spaces, no missing values. But when we look at this data table, we see that for layout and wafer ID, we do have replicated values. This points to a hierarchy.

Let's start a little bit easier and use JSL, the JMP scripting language, to understand what a hierarchy and structures are. This is a list. Lists are in curly brackets, and their elements are accessed by position. The second position here is my last name, Heinen.

This is the same content but as an associative array. Associative arrays come in square brackets and their elements have names. This element is called fn for first name, and the data is Bernd. The content of first name is Bernd.
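
As a rough illustration outside JSL, the two container types have close analogs in Python: a list accessed by position and a dict accessed by name. This is only a sketch. The key name "ln" is an assumption (only "fn" is mentioned in the talk), and Python counts positions from 0 whereas JSL counts from 1.

```python
# Python analog of a JSL list such as {"Bernd", "Heinen"}:
# elements are accessed by position.
name_list = ["Bernd", "Heinen"]
last_name = name_list[1]  # second position (index 1 in Python, 2 in JSL)

# Python analog of a JSL associative array: elements are accessed by name.
# "fn" is the name used in the talk; "ln" is a hypothetical key for this sketch.
name_map = {"fn": "Bernd", "ln": "Heinen"}
first_name = name_map["fn"]

print(last_name, first_name)
```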

You have two different ways to build structures for data. These structures can be nested within each other and lead to combinations like these. You see a lot of curly and square brackets, and at the end, you see two square brackets and the curly brackets being closed. It's quite difficult to find out what is where in a structure like this.

But this is important to know because especially, for example, in so-called REST interfaces where computers communicate with internet services, data is exchanged in JSON structures, and the JSON structure is exactly such a combination of lists and associative arrays. It's important to find out what is where.
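
To make the point about JSON concrete, here is a small Python sketch, not the JSL script from the talk. The JSON text is invented for illustration. It shows that a parsed JSON document is just nested name-based and position-based containers (dicts and lists in Python, associative arrays and lists in JSL).

```python
import json

# Hypothetical JSON text, as a REST service might return it (not from the talk).
payload = '{"person": {"fn": "Bernd", "ln": "Heinen"}, "values": [4, 5, 6]}'

# JSON objects become dicts, JSON arrays become lists.
data = json.loads(payload)

# Record which kind of container sits under each top-level name.
kinds = {key: type(value).__name__ for key, value in data.items()}
print(kinds)
```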

To do this manually is a little bit clumsy. There is a program that does that for you. I wrote this, I make this available. You don't need to understand a lot about programming. There are only two important statements here, For and Recurse.

For means this is a loop. Everything in these brackets is repeated several times. Recurse is an interesting thing. It tells JMP, keep in mind where we are in the program and run the whole application again. Then after you're done, come back to this place and go on. Recursion, by its nature, can be repeated an arbitrary number of times as well.

What does this program do? It takes the whole structure in the beginning. It asks, are you a list or an associative array? If yes, we assign it the role of parent. When we are thinking about hierarchies, we can think about generations, grandparent, parent, children, grandchildren, and so on. This is the first parental generation.

We take the first element, the first child of them, increase the stack level, so we keep in mind we are now in generation two, and then we start the whole questioning again until we come to an end. The end is always a data element.

Then we keep in mind, we save all the information of this data element, decrease the stack level, so we go one generation back. We are in the grandchildren generation, now we go one generation back and ask, do you have siblings in that generation? If yes, we take the next sibling and the whole process begins again until there is no sibling anymore, and that happens in the end at the top position, so at the grandparental generation. Then we create the output.
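
The walk described above can be sketched in Python as a recursive function. This is a minimal analog under stated assumptions, not the JSL script provided with the talk. The function name `crawl`, the root label "whole set", the row fields, and the sample data are all hypothetical. Python's own call stack plays the role of "keeping in mind where we are", and the `level` argument stands in for the stack level.

```python
def crawl(node, name="whole set", level=1, rows=None):
    """Walk nested dicts and lists depth-first, collecting parent-child rows."""
    if rows is None:
        rows = []
    if isinstance(node, dict):
        children = list(node.items())
    elif isinstance(node, list):
        # List elements have no names, so label them by position (1-based).
        children = [(f"{name}[{i}]", item) for i, item in enumerate(node, start=1)]
    else:
        return rows  # a data element ends the descent
    for child_name, child in children:
        is_container = isinstance(child, (dict, list))
        rows.append({
            "parent": name,
            "child": child_name,
            "level": level + 1,
            "value": None if is_container else child,
        })
        crawl(child, child_name, level + 1, rows)  # one generation down
    return rows


# Hypothetical sample structure, invented for this sketch.
nested = {"lab1": {"E2": [4, 5, 6]}, "lab2": {"E1": [7]}}
rows = crawl(nested)
for row in rows:
    print(row)
```

One simplification to be aware of: the sketch does not make names unique, so the same key appearing under two different parents would produce identical child labels.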

Here's an example of the output. In inter-lab comparisons, standardized parts are sent to labs where they are tested with different analyzers, and these spit out different groups of measurement values.

That data is analyzed then and displayed in this table. We see there is a parent and child column, and I colored this a little bit to make it easier to see what happened. This is the first path down to the first data element, then all its siblings, one level up, and there we find that there is already the end, no further children, and then one level up again and the next structure element.

Because we saved the data as well, we know for every step what we are talking about. About a lab, about a part, about a group of measurement values of the lab analysis. Here at the bottom, you have a value that is called cv in this specific case, for coefficient of variation.

What do we do with a data table like this? We want to make this diagram. This is in the Analyze menu, Analyze, Quality and Process, Diagram. I did this before, so I can fill this out automatically. It's really no rocket science.

The default is the fishbone diagram. Right at the end, we have the whole set here. This is difficult to read. With a right click, a right click here, we can change that type to hierarchy. Now we see we have a hierarchy of up to seven levels, and here begins the whole set and so on. We have associative arrays. We have lists as well.

How do we navigate in these groups? We start with the whole set, of course. Then we go one level down to an associative array of the parts, then a level further down to the first lab. You see the line goes way to the right, so there are some more labs. We take one of the analyzers, E2 in this case. This group of measurement values is a list, and the third element in the list is the data point with the value six.

This is exactly how a programmer would access that data point. This is much easier to find out and to follow than finding your way through a structure like this. We have a program that does the analysis. We have applications, JSON interface, REST interfaces, JSON structures, where this is very helpful.
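
As an illustration of that access path, here is a Python sketch with a much smaller structure than the seven-level one shown in the talk. Everything except the analyzer name "E2" and the value 6 is invented, including the key names "parts" and "lab 1".

```python
# Hypothetical miniature of the inter-lab structure; all names and numbers
# except analyzer "E2" and the value 6 are invented for this sketch.
whole_set = {
    "parts": {
        "lab 1": {
            "E1": [3, 5, 4],
            "E2": [7, 8, 6],
        },
        "lab 2": {"E2": [5, 6, 7]},
    }
}

# Follow the path read off the hierarchy diagram, one level per step.
value = whole_set["parts"]["lab 1"]["E2"][2]  # third element: index 2 in Python
print(value)
```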

Everything perfect? Well, the script assumes that the structure is already a JSL structure. It would be perhaps nice if there was a user interface where a user can say, "This is my data table. This is the hierarchy of my columns. Please generate a graph like this." Then the program would create that program.
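
The core of such a feature, going from a flat table plus a column order to parent-child pairs, could look roughly like the following Python sketch. It is not part of the provided script. The function name `table_to_pairs`, the root label, and the sample rows are hypothetical; only the column names layout and wafer ID come from the talk.

```python
def table_to_pairs(rows, hierarchy, root="whole set"):
    """Turn flat rows into unique parent-child pairs along the given column order."""
    pairs = []
    seen = set()
    for row in rows:
        parent = root
        for column in hierarchy:
            # Prefix the value with its column name to keep labels readable.
            child = f"{column}={row[column]}"
            if (parent, child) not in seen:
                seen.add((parent, child))
                pairs.append((parent, child))
            parent = child  # descend one level for the next column
    return pairs


# Hypothetical table rows; replicated layout values point to the hierarchy.
table = [
    {"layout": "A", "wafer ID": "W1"},
    {"layout": "A", "wafer ID": "W2"},
    {"layout": "B", "wafer ID": "W3"},
]
pairs = table_to_pairs(table, ["layout", "wafer ID"])
for parent, child in pairs:
    print(parent, "->", child)
```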

When you are working with this diagram, you may recognize that it is not 100% integrated in JMP like modern platforms are. It seems its roots go way back to JMP version 2 or 3. If there's someone from JMP development listening, please recondition that platform.

You can get that script from my Discovery presentation web page. If you are curious, if you have questions, if you want to see me at the Discovery Summit on Thursday, March 7, you'll find me in the Ballroom at Display 2. I'll be there.

Presented At Discovery Summit Europe 2024

Presenter

Bernd Heinen

Skill level

Intermediate

Automation and Scripting

Data Exploration and Visualization