In a high-mix, low-volume semiconductor manufacturing facility, devices are tested multiple times and at multiple temperatures throughout the process and tracked using manually entered serial numbers. While essential for traceability, manual data entry introduces the risk of transcription errors – leading to inconsistencies in device histories and potential misinterpretation of test results. This presentation showcases a real-world application of JMP to flag serial number entry errors by analyzing device behavior patterns across test stages.
Using electrical test data with thousands of measurements per device, we demonstrate a structured approach for identifying potential mismatches. We begin with data preparation, including z-score standardization, to bring measurements across temperature conditions onto a common scale. Next, Predictor Screening is used to isolate the most informative measurements, followed by variable clustering to reduce the parameter list to only the most representative predictors. This set of variables forms a behavioral fingerprint for each device. We then apply hierarchical clustering to observe when serial numbers group tightly – and when they don't.
By visualizing and clustering devices based on this fingerprint, we identify cases where two different serial numbers exhibit near-identical electrical behavior – suggesting a likely entry error. JMP’s built-in formulas, Predictor Screening, multivariate analysis, and clustering tools are used to guide exploration and support evidence-based conclusions.
This approach has been used to identify and resolve data integrity issues. This presentation targets engineers, analysts, and quality professionals seeking to apply analytics to real-world manufacturing challenges – without losing connection to the physical meaning of the data.

Thank you for joining me. This is Catching Typos with Test Data. My name is Nathan Gambles, and I am a Principal Product Engineer at Frontgrade Technologies. At Frontgrade, I help in the manufacture and test of high-reliability integrated circuits, where our customer focus is on medical applications, aerospace applications, military, and any other situation that calls for electronics that deliver a high degree of safety.
Today, I'm going to share with you the story of a root cause analysis investigation that I was part of, where we were seeing some interesting anomalies in the data, and it turned out that typos were causing issues in our test results. I'm going to share the process we used to identify and further investigate those situations.
I'm going to start with a little bit of background on what exactly I was doing. Like I said, I was manufacturing integrated circuits; I've got an example of one on screen here. Many people refer to these as computer chips. Integrated circuits come in all different shapes, sizes, and pin counts, and many of them have a great many pins that do various things. When we're testing these parts, we end up having to do the same test on many of those pins. That is something we will see in the test data when we get to it in just a minute.
The other thing I want to note about integrated circuits is the mark on top. Generally, these have a mark that says who made the part, what the product is, and what lot it came from, or maybe a lot date code. Sometimes they have serial numbers. Most of the products I work on do have serial numbers, and that is how we track the parts through our manufacturing process. The serial number is going to be important as we move through this story.
The facility I work at is a high-mix, low-volume facility. Our lots are small: we don't make too many parts at the same time, but on any given day, we might be making many different products.
When we make a lot, we organize the parts, and I'm going to describe what happens when we get to a test operation. At a test operation, we load the test program on our automated test equipment, and once the program is loaded, an operator takes a part, reads the mark on it, puts the part in the test fixture, enters the serial number, and hits test. The test is executed and the data is collected. When it's complete, the operator takes the part out of the test fixture, puts it back in the tray, and repeats that process for all of the parts in the lot.
Taking a step back, I just described the general procedure at a single test operation. The larger flow is that we test the parts at room temperature, then stress them, then test them again at room temperature, and then test them at cold and hot, the minimum and maximum operating temperatures per the specification.
We follow that up by testing the parts again at room and then again at cold and hot. We do the second round of testing as part of our quality checks, to ensure that we don't have an excessive number of failures getting past our first round of screening at operations two and three. The root cause investigation I was part of was spawned because we were seeing a higher-than-desired failure rate at test operations four and five, where we didn't expect many failures.
Many branches of the investigation went off in different directions. I'm going to focus on the one branch where we took the data from some of the failures we were seeing at these test steps and dove into it a little bit. Typically, for a given measured parameter, if a given device was on the high end of the distribution at cold, it would also be on the high end of the distribution at room, and on the high end at hot. The same held for a device on the low end or in the middle.
Some of the devices that were failing later in the process didn't follow that pattern. Sometimes some of the data said they were on the high end, other data said they were on the low end, and still other data said they were on the high end again. This doesn't make sense from a semiconductor device-physics point of view.
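To make that expected pattern concrete, here is a minimal sketch of one way to quantify it (not from the presentation; the table, column names, and values are synthetic stand-ins): rank each device within its temperature group, then flag devices whose rank jumps around between temperatures.

```python
# A toy stacked table: three devices measured at three temperatures.
# The serial/temp/value column names are hypothetical.
import pandas as pd

df = pd.DataFrame({
    "serial": [1, 2, 3] * 3,
    "temp":   ["cold"] * 3 + ["room"] * 3 + ["hot"] * 3,
    "value":  [1.10, 1.05, 1.00,  2.10, 2.05, 2.00,  3.05, 3.10, 3.00],
})

# Rank each device within its temperature group (1 = lowest reading).
df["rank"] = df.groupby("temp")["value"].rank()

# A device that is consistently high (or low) keeps roughly the same rank at
# every temperature; a serial-number mix-up shows up as a large rank spread.
spread = df.groupby("serial")["rank"].agg(lambda r: r.max() - r.min())
print(spread.sort_values(ascending=False))
```

In this toy data, devices 1 and 2 swap positions at hot, so both show a nonzero rank spread, while device 3 stays put.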
After some brainstorming, it was decided that the most likely root cause was that maybe we weren't actually looking at data from the devices we expected: maybe typos were getting mixed in with our data, causing this confusion.
As we progressed through this investigation, we wanted to identify an analysis technique that could take the data from any of our parts (again, we're in a high-mix environment), that didn't require too much subject-matter expertise to perform, that was fairly easy to follow, and whose results were fairly straightforward to review. That was our goal moving through this investigation.
That's the background of what we're doing and why. The example data I'm going to be using is available in the JMP journal here; if you click that button, it will generate a data set for you. I've already got this data set open, and I'd first like to walk through the data so we have an understanding of it.
I've got 441 rows in this data table. Each row is an entry for a lot at a test operation (matching the flow in the PowerPoint we just looked at), along with the temperature the testing occurred at and the serial number the operator entered during the test operation. The data set also has 300 columns of measured values. Each row is basically one execution of the test program, and we can see that we've got a large number of tests: 300 of them.
I mentioned that we have multiple pins. We can see that we have measurements for test A performed on pin one, test A on pin two, and if I scroll quickly, we can see that we perform many tests on many different pins. We also have other tests that aren't done on a pin-by-pin basis, but many of them are. We've basically got a whole alphabet of tests in here, giving us 300 columns of measured values.
One thing I want to point out is that the data in this table is stacked: some rows have room-temperature data, some rows have cold data, some rows have hot data. When we do the data preparation, I'm going to use some group-by functionality in JMP to make sure the statistics we calculate don't mix together the differences we might see between these temperatures.
Doing a quick review of the data, if I pop open Graph Builder, this chart shows which operation we're at, which temperature we're collecting data at, and all of the serial numbers that were tested. On the Y-axis, I've got the measured values. We can see that the room-temperature data sits right about here; at cold, it shifts down to a different value; at hot, it shifts again; then back to room, back to cold, back to hot.
Shifts in measured values with temperature are quite common in semiconductors. If I scroll through many of these tests, we can see that we're doing the same test on many of these pins, so we end up with a lot of highly correlated parameters; a lot of these tests look very similar to each other. If I jump down to some other tests, we've got different-looking data, and some of the tests have different responses to temperature.
Here's an example: some of the measured values have some fairly extreme outliers. It is not uncommon in semiconductors that if a device has a defect and it fails, it fails big. Taking care of outliers is definitely something we have to do in our process. We also see shifts in the measured values because of temperature, so again, we're going to have to use some of the group-by functionality in JMP to deal with this, since our data is stacked.
A real quick look at the distribution of our serial numbers shows that we've got 65 unique serial number entries in this data set. We've explored the data and have some sense of what we're looking at, so next, we're going to start the process we identified that works well for helping to identify typos in our data.
Our first step is data preparation: we're going to remove outliers using Explore Outliers, and we're going to calculate standardized z-scores using formula columns. To do that, we go to Analyze > Screening > Explore Outliers, grab all 300 of our measured values, and do this analysis By temperature.
I'm going to use Quantile Range Outliers for this. Just as a note: throughout this entire presentation, I'm using the JMP defaults for all of the platforms. At cold, in order to clear the outliers in the table, I'm going to use the Make Formula Column option.
I'll leave the default settings there and move on to room, where I'll do the same thing, and then the same again for hot. Now we have generated a number of formula columns in our table, and after a little bit of cleanup, we have 300 new columns of data where outliers have been removed.
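For readers outside JMP, here is a rough analogue of the quantile-range-outlier step (a sketch, not JMP's exact implementation; I'm assuming JMP's documented defaults of tail quantile 0.1 and multiplier Q = 3, and the data is synthetic). Values outside the fences are blanked to NaN, much like the generated formula columns blank outliers.

```python
# A rough analogue of Quantile Range Outliers, applied per temperature group.
# Assumed defaults: tail quantile 0.1, multiplier Q = 3. Synthetic data.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "temp": np.repeat(["cold", "room", "hot"], 50),
    "Test A pin 1": rng.normal(size=150),
})
df.loc[3, "Test A pin 1"] = 50.0   # an extreme "fails big" outlier

def remove_outliers(s, tail=0.1, q=3.0):
    lo, hi = s.quantile(tail), s.quantile(1 - tail)
    spread = hi - lo               # the inter-quantile range
    # Anything beyond the fences becomes NaN, like the blanked formula column.
    return s.where(s.between(lo - q * spread, hi + q * spread))

df["Test A pin 1 cleaned"] = df.groupby("temp")["Test A pin 1"].transform(remove_outliers)
```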
The next thing I want to do is set up operation and temperature to be used as a Group By, because I'm going to use the New Formula Column feature, and I want the statistics to be calculated by operation and temperature for all 300 of our outlier-cleaned parameters.
With all 300 of the columns we've cleared the outliers from selected, if I right-click and choose New Formula Column > Distributional > Standardize, JMP calculates the standardized z-score for all of our parameters. That creates a whole new group of columns for us, and after a little bit of cleanup, we now have 300 columns of standardized data.
Just to get a quick visual on what we've accomplished, if I open this Graph Builder looking at test E8: previously, we had an extreme outlier that made it hard to see the data. After removing the outliers, we were able to see the data, but we still had jumps in the measured values between the different operations and temperatures. After standardizing by operation and temperature, each of those distributions is centered at zero with a standard deviation of one, which means pretty much all of the data falls between plus and minus three on this scale.
Now we're able to ask questions like: were these points on the high end of the distribution at that operation and temperature? Were they on the low end? Or were they in the middle? That is what we've done in the data preparation step.
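The standardize step itself is just a grouped z-score. As a minimal sketch with synthetic data (column names are hypothetical), the same calculation outside JMP looks like this:

```python
# Grouped z-score: standardize within each operation-by-temperature group so
# every group is centered at 0 with standard deviation 1. Synthetic data.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "operation": np.repeat([2, 3], 100),
    "temp": np.tile(np.repeat(["room", "cold"], 50), 2),
    "Test E8": rng.normal(loc=5.0, scale=2.0, size=200),
})

def zscore(s):
    return (s - s.mean()) / s.std()   # (x - group mean) / group std dev

df["Std Test E8"] = df.groupby(["operation", "temp"])["Test E8"].transform(zscore)
# Each group now has mean ~0 and std ~1, so the data falls mostly within +/-3.
```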
The next thing we want to do is identify which of these 300 parameters are good at predicting serial number. To do that, we're going to use Predictor Screening. I know from the data that I've got a whole bunch of parameters that are highly correlated with each other, so the results of the predictor screening are going to come up with some highly correlated parameters. After we use Predictor Screening, we're going to use Cluster Variables to get rid of many of the highly correlated parameters and keep only the most representative tests.
We go to Analyze > Screening > Predictor Screening, throw all of our standardized columns in there, and ask it to predict serial number. Again, there are some options down here; I used the default settings throughout this presentation. JMP processes the data and gives us a report that puts in rank order which parameters are best at describing the serial number of the device.
Taking a quick look down the list, this first parameter is really good, followed by several that seem pretty good. If we scroll down, it looks like after about 20, we're not getting much benefit by including more. I'm going to select the best 20 parameters here, which selects them in the data table. Now, under Analyze > Clustering, I'm going to use Cluster Variables to identify the highly correlated variables in this list of 20.
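JMP's Predictor Screening platform is built on a bootstrap forest. As a rough stand-in (not the identical algorithm), a random-forest importance ranking in scikit-learn gives the same flavor; everything below is synthetic:

```python
# Rank standardized measurements by how well they predict serial number,
# using random-forest variable importances. Synthetic data and names.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)
X = pd.DataFrame(rng.normal(size=(441, 300)),
                 columns=[f"Std Test {i}" for i in range(300)])
y = rng.integers(1, 66, size=441)   # stand-in serial-number labels

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
importance = pd.Series(forest.feature_importances_, index=X.columns)
top20 = importance.sort_values(ascending=False).head(20)   # keep the best 20
print(top20)
```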
If I put those 20 in the Y columns, we get this report here where I generally look for bright red and bright blue. I play the game where I count the squares. It looks like there's 1, 2, 3, 4, 5, 6. Coming out of this analysis, I'm expecting to have about six parameters that we're going to be able to use for our typo check.
Before I move on, I'm going to make sure I don't have any columns selected in the data table, so that when I make selections in this report, only those columns end up selected. From this clustering, down in the cluster summary, it took all 20 of our parameters and said, "I can turn these into four groups, and here are the most representative members of each group."
That's great. But I've found that this platform really likes to cluster things. It looks like I have two parameters here that don't correlate well with any of the other 20 parameters, yet they still got added to some of these groups, and I don't want that to be the case. So not only do I look at what the most representative member is, I also look down this column and make sure that if a parameter ends up in a cluster, it correlates well within that cluster.
Looking down here, we see a lot of values very near one, meaning they correlate highly. But then we have this guy, very near zero: he doesn't correlate well with the group he got clustered into, he doesn't correlate well with the next nearest cluster either, he doesn't cluster well with anything. I want to keep that parameter in the mix as well. Looking further down, there's one more that doesn't correlate well with its group. These four representatives plus these two get me to the six I was expecting based on the picture.
That gives us six columns selected in our data table. If I do a little bit of column arrangement, ungrouping these and grouping them together, we get a list of what I'm calling our best predictors for trying to understand our serial numbers based on the data we have.
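JMP's Cluster Variables platform is based on principal components; as a rough correlation-based analogue (a sketch under my own assumptions, including the 0.3 "loner" cutoff, not the platform's logic), you can cluster the top-20 columns on 1 − |r|, keep the most representative member of each cluster, and also keep any column that doesn't correlate well with its assigned cluster:

```python
# Reduce 20 correlated columns to a handful: hierarchical clustering on the
# correlation matrix, one representative per cluster, plus poorly fitting
# "loners". Synthetic data; the 0.3 cutoff is an illustrative assumption.
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

rng = np.random.default_rng(3)
base = rng.normal(size=(441, 4))
noise = rng.normal(scale=0.2, size=(441, 20))
X = pd.DataFrame(base[:, np.repeat(np.arange(4), 5)] + noise,
                 columns=[f"Std Test {i}" for i in range(20)])

corr = X.corr().abs()
# Condensed distance matrix: highly correlated columns are "close".
dist = squareform(1 - corr.values, checks=False)
labels = fcluster(linkage(dist, method="average"), t=4, criterion="maxclust")

keep = set()
for k in np.unique(labels):
    members = corr.columns[labels == k]
    within = corr.loc[members, members].mean()   # mean |r| with own cluster
    keep.add(within.idxmax())                    # most representative member
    keep.update(within[within < 0.3].index)      # keep uncorrelated loners too
print(sorted(keep))
```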
Now that we've identified our important parameters, we're ready to do the analysis that should tell us whether we have serial number typos or not. We're going to use the hierarchical clustering platform for that: under Analyze > Clustering > Hierarchical Cluster, we use the six best predictors we just identified, and we label with serial number.
Again, there are more options; I stayed with the JMP defaults for everything. This gives us the report for the hierarchical clustering. I'd like to make a few changes here: I set Color Clusters, which makes it a little easier to see the different groups, and I like to control how many clusters get made. When we reviewed the data initially, there were 65 different serial numbers in this lot. I have found that this procedure works best if you don't set the number of clusters at exactly 65, but just a little below the total number of devices you expect in the lot. For this example, I'm not going to put it at 65; I'm going to put it at 58.
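Outside JMP, the equivalent device clustering and dendrogram can be sketched with SciPy (synthetic data; I'm using Ward linkage, which I believe matches the JMP platform's default):

```python
# Cluster devices on the six best predictors and draw a sideways dendrogram
# labeled with the entered serial numbers. Synthetic data throughout.
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster

rng = np.random.default_rng(4)
X = rng.normal(size=(441, 6))              # the six best-predictor columns
serials = rng.integers(1, 66, size=441)    # entered serial numbers (stand-in)

Z = linkage(X, method="ward")
dendrogram(Z, labels=serials, orientation="right")  # root to the right
plt.show()

# Cut the tree at 58 clusters, a little below the 65 serial numbers expected.
clusters = fcluster(Z, t=58, criterion="maxclust")
```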
The last thing I need to get this analysis going is a data filter from the data table, set on serial. This lets me select all the rows where serial is, for example, one. If I do that, we can see over in the dendrogram report that the rows of data for serial number one all clustered together: they're all the same color, and they're all vertically next to each other.
That suggests that our clustering algorithm has been successful with the six parameters we gave it. Walking through, serial number two also clustered well, and so did the next one. If we keep walking through each serial number, watching how well it clusters, we get to serial number 17, where most of our data is down here, and at least one of our data points is up there.
There is a pretty significant vertical distance between these. But more important than the vertical distance is the path you'd have to trace along the dendrogram branches to connect the points you're looking at. It looks like I'd have to draw this line. Then the question is: how far to the right did you have to go? We had to go pretty far to the right to make this connection, and the farther to the right you have to go, the more likely it is that these data don't really look like those data.
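"How far to the right" has a standard name: the cophenetic distance, the merge height at which two observations first join the same cluster. A small sketch of pulling it out of the linkage from the previous snippet (synthetic data again):

```python
# Cophenetic distance = the dendrogram height where two rows first merge.
# Large values between rows that share a serial number hint at a typo.
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import squareform

rng = np.random.default_rng(4)
X = rng.normal(size=(441, 6))
Z = linkage(X, method="ward")

merge_height = squareform(cophenet(Z))   # condensed -> full 441x441 matrix
print(merge_height[0, 1])                # how far right rows 0 and 1 join
```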
Based on my understanding of this dendrogram, these data look quite different. If I zoom in on this point that sits all by itself, it looks like we have an instance where, based on our six parameters, devices that looked like this at test were almost always entered as serial number 12, except for one time, when the serial number entered at test was 17.
This is likely a case where the mark on the part was misread; it seems pretty likely that, depending on the font and the size of the mark, a two could look like a seven. That is one of the common ways we have seen typos show up in our data, and this method has just identified one: yes, we have been able to identify typos in this data set using this method.
Moving on, selecting all the points that match on device serial number 18, it looks like we have a little bit of distance between some of these. Zooming in on the data and playing that same game, drawing the line between the points, it looks like I can connect them there.
Then the question is: how far to the right did I have to go? Not very far at all. It seems likely that all of this data looks very similar. This seems like a situation where devices 18 and 29 look pretty much the same, which caused a little bit of separation for the entries for device 18. This doesn't seem like a typo to me; it just seems like an instance where two parts happen to look similar. This is definitely not something I would spend time investigating further.
If we continue through a few more of these serial numbers, they cluster well, and then we get to 25, where it looks like we've got a separation. If we play the game, we had to come a decent distance to the right, so these data don't look very similar. If we zoom in on the single selected point, it looks like most of the time, parts that look like this when tested get a serial number of 26, but we have one instance where 25 was entered.
If we zoom in on the other selected data points, 25 usually looks like this, except for one instance where an entry of 26 occurred. Keeping an eye on what's selected in the dendrogram as I jump between serial numbers 25 and 26, this looks like an instance where we had two devices, 25 and 26, and both got tested, but when device 25 was tested, it was tested with serial number 26, and when 26 was tested, the serial number entered was 25.
Somewhere along the way, these parts got out of order, and that caused this type of typo in our test data. This is another type of typo we have identified using this methodology. We could continue going through the rest of these serial numbers, and that's what we would do in the full analysis, but for the sake of time, those are the only ones I'm going to go through.
Stepping back, the question was: can we come up with a simple way of analyzing the data, one that doesn't require too much subject-matter expertise, to see whether or not we have typos? I feel we were pretty successful at identifying a process that was easy to follow. We didn't have a lot of subject-matter expertise on what any of these test parameters were, but we were still able to walk through the process and get to at least a short list of possible typos that we can take and discuss with the subject-matter experts.
The process is simple enough that we've had good experiences getting both experienced engineers and new engineers to use it. It uses all built-in JMP features, and we even used all the default parameters, which makes it easy to automate and script. That is what we did in our environment: the engineers don't have to run it manually, but all of them go through and understand the process I just walked through with you here today.
It was successful. We gained insights; we were able to identify typos in our test data, and we've used those insights to find systematic patterns: this type of typo happens often, so what causes it? That one happens pretty often, so what causes that? We've been able to make improvements to our processes and systems. It's helped us improve training, and it's been quite a useful tool for having discussions and providing feedback to the test operators on the things we can see and the problems we're having. When they are more informed, they are better able to keep an eye on these issues and provide input on changes, so we can continue to do things better and keep making a higher-quality product coming out of our facility.
That is the process we've developed and used. If you have longitudinal data, where you have tested a device or a product multiple times at the same or approximately the same conditions, perhaps some of these methods could help you identify possible issues in your data as well. Thank you for joining me. I hope you found this useful. Have a nice day.