Streamlining Manufacturing Excursion Investigations with JMP
Excursions can lead to significant costs at a manufacturing facility. In the semiconductor industry, production downtime, scrapped and low-yield wafers, and decreased output can result in substantial revenue losses. Engineers tasked with investigating these excursions must quickly uncover actionable insights for data-driven decisions that minimize downtime and improve product quality.
JMP's Analytic Workflow can help you outline the steps and tools needed to efficiently investigate excursions. Explore how Query Builder facilitates data access from databases and how JMP's powerful Tables menu aids in data manipulation. Discover optimal table formats for visualizing and analyzing wafer map data. Utilize exploratory and analytical techniques to discover hidden relationships across manufacturing process steps. Enhance your analysis with JMP Pro, leveraging features like image analysis in the new Torch Deep Learning Add-In for JMP Pro 18 to gain advanced insights.
Automate and rerun your entire analysis using Workflow Builder in JMP, ensuring speed, repeatability, flexibility, and analytical power without the need for coding. Learn how leveraging these tools and techniques in JMP and JMP Pro can lead to efficient resolution and substantial cost savings in a fast-paced manufacturing environment.
Note: The Workflow contained in the "Semiconductor Defects Workflow.zip" folder uses CSV import instead of Query Builder, although Query Builder will be used in the presentation.
Hello, everyone. My name is Ryan Cooper, and I am the JMP Systems Engineer for the Gulf Coast Territory. Today we're going to be talking about Streamlining Manufacturing Excursion Investigations with JMP. While this is a semiconductor example, the idea of following a plan like this for excursions is transferable to any industry that does manufacturing. I'll be showing some tips and tricks that may be useful for anything you're doing in JMP, not just excursion investigations. There'll be a little bit for everyone.
With that, let's move along to our excursion here. If you were expecting to leave your manufacturing excursions behind for the next 30 minutes or so, I'm sorry, we've got another one here. Hopefully, after we go through this one, you'll have some new tools in your JMP toolbox to use on your own. This particular excursion has some mysterious wafer signatures that baffled engineers from the Defect Engineering Department at URA Semiconductor Company. There are four wafer maps here, different signatures that are coming from various places, and we need to figure out where they're coming from.
We're going to do that using various areas of the analytic workflow within JMP. We're going to look at data access first, and then some data blending and cleanup, leading into some data exploration and visualization. We'll focus a little bit on some modeling tools here, but we'll wrap it all together with automation and scripting. The big key here is that there's no programming with the Workflow Builder. We're going to come back to various areas of this workflow throughout the presentation in various spots, and you'll see why. It's a really good way to organize solving a problem in JMP.
The data tables that we need. I always get asked, "How do I get started doing something like this? There's a whole lot of data being collected in my company." My answer is usually: always have a question in mind. In this case, we're expanding on the question that we asked at the beginning: where are these defects coming from? We want to know what process steps are contributing to abnormal wafer defect signatures on Product A. All of these wafer signatures are coming from Product A. Then usually what I do is try to connect some keywords to different areas of a data table or a database.
You'll notice this table holds wafer die defect data, which corresponds to the wafer defect process steps. I also need process data with parameters like product (for the Product A defects), step, process, things like that. You can see the way this data is formatted: it's in the tall data format. That works well for visualizing wafers, but it doesn't work well for prediction and gaining more sophisticated insight from models. In general, problems with predictors that can influence outcomes need the data in the wide format. You'll see this is in a tall format, and we'll be able to visualize the wafers just by using X and Y in Graph Builder.
This is already in a pretty good format for that. But if we want to look into what's causing something, we'll need to get it into more of a wide data format, and there are ways to do that. On the next page here, this is the format we're going to want in order to answer our question. You'll notice we have our wafer here on the left, individualized, with a graph generated from the full die-level defect data. We're going to make that graph and save it to the table, and we'll use it in various parts of our analysis.
Then we've got predictors over here that are generated from process step data. You'll see in this case, we have steps and then the equipment ID. The equipment ID is after the underscore here. We'll be able to figure out what equipment IDs and what steps are causing certain defects here. Our defect total right here is the output. It's just not on the screen. That's our end result. That's the table that we want.
Our analysis plan here. I mentioned we were going to come back to the analytic workflow, and this is where it comes up for the first time. We have data access, where we're going to get our data using Query Builder, which handles the filtering in SQL. Then data blending and cleanup, where we'll be splitting and joining tables. The split is what gets you into the wide format, and then we need to join the tables together in order to make the relationships to solve the problem. We'll have exploration and visualization here.
Like I mentioned, the die-level data is in stacked format. We're going to save the graphs to the data table, which is a really cool feature that not everyone knows about. We'll visualize our analysis results using the wide format, with the graphs that we saved serving as markers or labels, and then use modeling tools for our analysis at the end. We're going to follow this flow: do the data access first, visualize that die-level data since it doesn't involve any data blending and cleanup, do a little cleanup to get ourselves into the wide format, and then do the modeling and analysis afterwards.
Another thing I should note is that 80% of solving a problem using analytics is accessing and preparing the data. In this case, that's these steps right here. A lot of times I'll see that this is what turns people off of doing analysis like this. It's pretty easy to be unmotivated, especially if you're having to code or program, which luckily you don't have to do in JMP. We have some great tools for data preparation and for stringing it all together with Workflow Builder. We'll see how all of this is connected in a little bit. As for our goals by the end of this: we're going to have actionable information based on data-driven decisions, and we're going to create some reusable knowledge through Workflow Builder.
This solution is going to be speedy, flexible, and powerful along the entire workflow, so we can come to an efficient resolution where we're reducing downtime, enhancing quality of our wafers and our product, and increasing revenue and cost savings as a result. With that said, we are going to get into the JMP demo. I also want to note that this is meant to be an overview, so don't worry if you don't catch everything. I'll be setting up some things live and running a short script on others at times, just for time. But you can review the video and the workflow that I'll be attaching as a reference for certain steps that you're interested in.
Let's start by looking at Query Builder. This is where we're going to access our data. I'm going to go to File, Database, and Query Builder, and I'm going to look for my defect data in here. I'm going to type in semiconductor. You can see there are a few different semiconductor tables. You might have several different tables, and we'll have to find the right ones. In this case, I'm going to look at the wafer die defect count data and click Next.
You'll get this nice little preview here of the table itself and the columns that are in it. You can get a table snapshot. You can see how it's oriented, with the product, the die ID, wafers, and then the number of defects on that die for each chip on the wafer. This is very, very granular data: literally one line item for each chip at each defect-scan step. We're scanning for defects. You can do some joining up here if you'd like, but we're going to do all of our joining and data table manipulation within JMP later. For now, I'm going to click Build Query.
Just to look at what's going on here, I'm going to add product to the list here. You can see that we just added one column, but I can see the distinct rows if I'd like by toggling this on. I'll toggle it off for now. I'm going to add in this order my die ID, my X and Y die, my wafer, and my defects. You can see all that get added right here. I want to filter by Product A, since that's where my issue is. I can filter by Product A. I can also prompt for my filtering when I run this. If you have another product that you want to look at, you can do that throughout the same workflow.
I'm not going to do that in this situation, but just know that it's there. Because I'm filtering by Product A, I can also get rid of the product in my columns here. What's happening is it's writing a little bit of SQL so that it pulls from the database exactly what I tell it to, based on my drag and drops. You don't actually have to know SQL to do this. With that information, I'm going to run my query and then close out. Here are my results. Now I can take this and go to Graph Builder. We've got our X die; I'll drag that over to X, and my Y die to Y. The default is this smoother, but I can turn that off and make it data points. This looks a whole lot more like a wafer.
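Query Builder writes that SQL for you, so none of this is required in JMP. But if you're curious what the generated pull looks like in code, here is a minimal stand-alone sketch using Python's built-in sqlite3. The table and column names are assumptions for illustration, not the actual database schema.

```python
import sqlite3

# Tiny in-memory stand-in for the production defect database
# (table and column names are made up for illustration).
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE wafer_die_defects (
    product TEXT, die_id TEXT, x_die INTEGER, y_die INTEGER,
    wafer INTEGER, defects INTEGER)""")
con.executemany(
    "INSERT INTO wafer_die_defects VALUES (?, ?, ?, ?, ?, ?)",
    [("A", "D1", 0, 0, 1, 3),
     ("A", "D2", 1, 0, 1, 0),
     ("B", "D1", 0, 0, 1, 7)])

# Roughly the statement Query Builder assembles from the drag and drops:
# the chosen columns, filtered down to Product A.
rows = con.execute("""
    SELECT die_id, x_die, y_die, wafer, defects
    FROM wafer_die_defects
    WHERE product = 'A'
""").fetchall()
print(rows)  # only the Product A rows come back
```

The WHERE clause is the piece the Product A filter adds; prompting on run just parameterizes that value.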
In this case, we've got everything together. I want to split this up by wafer, so I'm going to wrap by wafer. Now we can see individual wafers here. There are 125 total, and I want to give the graph some information about the defects. The way I'm going to do that is pull defects over to Overlay. Now we've got our number of defects on each die, or chip, on the semiconductor wafer. I don't really like this legend setting right now with the coloring; it doesn't really show chromatically what's going on with the defects.
I'm going to go in here and change it under legend settings. I'm going to make the color theme diverging, going from blue to red. We can take a look at what that looks like. Some of the signatures are starting to come up a little bit, but I want a little more contrast here so I can see the signatures better. I'll adjust my color theme slightly: create a new one, and take this marker to give more room for red. Click OK, and now we've got some of these signatures popping a bit more. You've got the top-and-bottom defect, the donut, and then a left-and-right, right here.
Those look good to me. I'm going to click Done. Here's a little-known thing about Graph Builder and what you can do with JMP: you can send the graph to a data table. If you go to the red triangle here and make it into a data table, you now have a data table with the graph icons in it. We can use this with some of our other information, and it'll make some of our visualizations much, much more interesting. I'm going to close out of this for now.
Now we need to think about splitting our data table to get this into a wide format so that we can do some more sophisticated analysis. We're going to split on this die ID so that we have all of our chip IDs, or die IDs, across the top with wafer on the left. We're doing that so we can merge this information with this other table here. Let's go to Tables and Split. I'm going to turn on this preview so we can see what's going on.
I want to split by my die ID, and I'm going to split the defects within that die ID. Now I've got 1,423 die across the top and 125 wafers on the left. In order to see the wafer, I'm going to drag wafer down here to Group. Now you can see wafer in this first column, with all of my die IDs to the right. That'll help me merge it with the other table. I'm going to give it a name; I've got one pre-made over here, so I'll just use that. I'm going to click OK.
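Tables > Split does this reshaping with no code, but the underlying tall-to-wide logic is simple enough to sketch in plain Python (the column names and values here are illustrative, not the presentation's data):

```python
# Tall format: one row per (wafer, die), as the query returns it.
tall = [
    {"wafer": 1, "die_id": "D1", "defects": 3},
    {"wafer": 1, "die_id": "D2", "defects": 0},
    {"wafer": 2, "die_id": "D1", "defects": 1},
    {"wafer": 2, "die_id": "D2", "defects": 5},
]

# Split by die_id, grouping by wafer: one row per wafer,
# one column per die ID, defect counts as the cell values.
wide = {}
for row in tall:
    wide.setdefault(row["wafer"], {})[row["die_id"]] = row["defects"]

print(wide)  # {1: {'D1': 3, 'D2': 0}, 2: {'D1': 1, 'D2': 5}}
```

Each wafer becomes one row and each die ID becomes one column, which is exactly the wide shape the modeling tools need.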
Now I don't need this anymore, so I'll close out of it. I've got two tables here, and I'm going to join their information. I'm going to go to Tables and Join. We're going to join the Graph Builder table with this other table over here, matching on wafer, which is our common column, and you can see the preview come up. We've got two wafer columns right now because wafer appears in both tables, so I'm going to choose just one for the output. Let's go with wafer and graph first and turn that on.
We have wafer and graph. Now we're going to choose the rest of our die columns, and we have all of that together. I'll also give my output a name, and I'll click OK. Since those two tables are now combined into this one, I'm going to close out of them. Now I have this data table where I can do a couple of different things. I don't have the total number of defects in here yet (I can sum that up, which I'm going to do in a little bit), and I don't have my process steps yet. Those will help us solve the problem.
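Again, Tables > Join handles this interactively. For readers who think in code, the matching logic is roughly this; in JMP the graph column holds an actual picture, so a file name stands in here, and all names are illustrative:

```python
# Two tables that share a wafer column: the graph table saved from
# Graph Builder (a file name stands in for the graph picture) and
# the wide die-defect table from the split.
graph_table = [{"wafer": 1, "graph": "wafer_1.png"},
               {"wafer": 2, "graph": "wafer_2.png"}]
die_table = [{"wafer": 1, "D1": 3, "D2": 0},
             {"wafer": 2, "D1": 1, "D2": 5}]

# Join on the common wafer column, keeping a single wafer column.
by_wafer = {row["wafer"]: row for row in die_table}
joined = []
for g in graph_table:
    match = by_wafer.get(g["wafer"])
    if match is not None:
        merged = dict(g)
        merged.update({k: v for k, v in match.items() if k != "wafer"})
        joined.append(merged)

print(joined[0])  # {'wafer': 1, 'graph': 'wafer_1.png', 'D1': 3, 'D2': 0}
```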
But I can do some interesting visualizations and modeling with what I've got, now that I have my die as predictors and my wafers over here on the left. The first thing I'm going to do is multivariate embedding, which is a new JMP Pro 18 feature. I'm using UMAP. What comes up first is this graph of UMAP1 and UMAP2: the dimensions have been reduced from all of those chips down to two, and we're going to look at some relationships here.
You'll notice the black dots don't really tell us anything right now; they just tell us the rows. This is where that graph column comes in. I'm going to click on Graph, right-click, and say Use for Marker. You'll see what happens over here in my multivariate embedding: now some of the wafers that have abnormal signatures are shown on the right, away from this cluster of wafers that are normal. You can start to see some of those abnormal wafers over here. That's a nice way to visualize wafer signatures that you may not have known existed.
The next thing I'm going to do is clustering. I'm going to run this separately as well. What this is going to do is use all of my die here to cluster similar wafers together. You'll see it saved the clusters to the data table: each wafer is assigned to a cluster, and I've created a Graph Builder with a column switcher, called a cluster selector here, where I can go through and see that there are 125 wafers in a model with one cluster. Then with two clusters, we're starting to see some of those different wafer signatures get teased out.
I can go down the line and see what this looks like. Cluster 5, I believe, was the optimum. You can see how you've got most of our left-and-right defects here and our top-and-bottoms over here. The big donut one comes up in a cluster of its own. You can do that for the rest of these as well. Moving along to the rest of our problem: we talked about having a table that needed process data in it. I'm going to run a Query Builder separately, just for time, that gets us that process data. It works the same way as the Query Builder that I ran before; we just find the process data instead.
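JMP's clustering platform does all of this interactively, but the idea is worth making concrete: each wafer is just a vector of per-die defect counts, and wafers with similar vectors get grouped together. Here is a bare-bones k-means sketch in plain Python on toy numbers; JMP uses its own clustering methods and helps choose the number of clusters, so treat this only as the underlying intuition:

```python
import math
import random

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means: assign each point to its nearest center,
    recompute the centers, and repeat."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            groups[nearest].append(p)
        # Recompute each center as its group's mean (keep old center
        # if a group went empty).
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) if g else centers[i]
                   for i, g in enumerate(groups)]
    return groups

# Each "wafer" is a vector of per-die defect counts (toy numbers):
# two edge-heavy wafers and two nearly clean ones.
wafers = [(9, 1, 8), (8, 0, 9), (0, 1, 0), (1, 0, 1)]
groups = kmeans(wafers, k=2)
sizes = sorted(len(g) for g in groups)
print(sizes)  # the dirty pair and the clean pair separate: [2, 2]
```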
I'm going to run that just to get my table here. Remember, this is now in a tall format. We want it in a wide format so that we can join it to our wafers and our product. What I'm going to do first is combine a couple of these variables so that they're more informative; you'll see why that's important after I do the split. I'm going to add an underscore between the step ID and the process type. We might have several different process types. In this case, it's just equipment, but maybe there's chamber, maybe there's something else that has a different type of ID. We want to concatenate these together in order to separate what's going on at each step for each process type.
I'm going to right-click here, go to New Formula Column > Character, and concatenate with an underscore. Now these are put together. I'm going to do the same thing for process type and type ID: concatenate those together with an underscore. Now that I have this information, I can do a split in a similar way to what I did before with my wafer die data. I'm going to run that separately as well, and then I can close out of this table.
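Those two formula columns are plain string concatenation. In code, the equivalent would be something like this (field names and values are illustrative):

```python
# Process rows as they come back from the query (values illustrative).
rows = [
    {"step_id": "Step 1", "process_type": "Equipment", "type_id": "B"},
    {"step_id": "Step 4", "process_type": "Equipment", "type_id": "L"},
]

# Mimic the two New Formula Column > Character > Concatenate steps:
# step ID with process type, then process type with its ID.
for r in rows:
    r["step_process"] = r["step_id"] + "_" + r["process_type"]
    r["process_id"] = r["process_type"] + "_" + r["type_id"]

print(rows[0]["step_process"], rows[0]["process_id"])
# Step 1_Equipment Equipment_B
```

After the split, "Step 1_Equipment" becomes a column name and "Equipment_B" its cell value, which is what lets the analysis separate what happened at each step for each process type.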
Now I have my wafers, again, 125 of them here, and the different steps and type of step with the ID of that particular equipment, or perhaps it's chamber or something else. This right here could have hundreds of steps, sometimes even thousands. It's not necessarily feasible to just go through and filter through these and see which wafers are common to these steps. What we're going to do is join this to our existing table, sum up the defects, and then figure out which of these steps is actually contributing towards our wafer signatures.
In order to do that, I'm going to go back here, and we're going to sum up all of our defects with another little formula trick. I've got all of the die columns highlighted, and I can sum them up with just a right-click: New Formula Column > Combine > Sum. At the very end here, I have a column that added everything together, and I'm just going to call it Defect Total. Click OK. Next, we're going to do a join similar to what we did before; again, I'll do that in the background. We're just going to have our wafer, our graph, our total defects, and then our process steps. Let's do that, and we can X out of these.
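The Combine > Sum formula column is just a row-wise sum across the selected die columns, which in plain Python would look like this (names illustrative):

```python
# One row per wafer in the wide table; die columns hold defect counts.
wide_rows = [
    {"wafer": 1, "D1": 3, "D2": 0, "D3": 4},
    {"wafer": 2, "D1": 1, "D2": 5, "D3": 0},
]

# Sum every die column into a single Defect Total column per row.
for row in wide_rows:
    die_cols = [k for k in row if k != "wafer"]
    row["Defect Total"] = sum(row[k] for k in die_cols)

print([r["Defect Total"] for r in wide_rows])  # [7, 6]
```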
Now we have our wafer, our graph, and our steps with our equipment IDs. Over at the end here, we have a defect total. This is really our end table, the one that lets us answer the question we initially set out to answer: which of these steps is causing these wafer signatures to be the way that they are? The first thing I like to do once I get my information into this type of data table format is go to Analyze, Screening, and Predictor Screening. I can throw all of my steps, and there might be hundreds or thousands of them, into the X, and my defect total into the Y.
This is going to run a model to figure out which ones are the most predictive of that defect total. It looks like Step 1, Step 5, and Step 4 might be potentially influential on my defect total. Let's take a look at those separately: Steps 1, 4, and 5. I'm going to go to Graph Builder. Remember, because we have the graphs as markers on the wafers, we'll be able to see what this looks like really quickly.
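Predictor Screening in JMP ranks predictors using bootstrap forest contributions. As a crude stand-in to show the idea, here is a plain-Python screen that ranks each step by how much of the defect total's variance its equipment groups explain. The data is toy data and the statistic is much simpler than what JMP actually computes:

```python
from statistics import mean, pvariance

# Toy wide table: each step column records which equipment ran that
# wafer, and Defect Total is the response (all values made up).
data = [
    {"Step 1": "Equipment_A", "Step 2": "Equipment_C", "Defect Total": 5},
    {"Step 1": "Equipment_A", "Step 2": "Equipment_D", "Defect Total": 6},
    {"Step 1": "Equipment_B", "Step 2": "Equipment_C", "Defect Total": 40},
    {"Step 1": "Equipment_B", "Step 2": "Equipment_D", "Defect Total": 42},
]

def screen(rows, predictors, response):
    """Rank predictors by the share of response variance explained by
    each predictor's group means (a crude stand-in for the bootstrap
    forest behind JMP's Predictor Screening)."""
    y = [r[response] for r in rows]
    total = pvariance(y)
    scores = {}
    for p in predictors:
        groups = {}
        for r in rows:
            groups.setdefault(r[p], []).append(r[response])
        between = sum(len(g) * (mean(g) - mean(y)) ** 2
                      for g in groups.values()) / len(rows)
        scores[p] = between / total
    return sorted(scores.items(), key=lambda kv: -kv[1])

ranking = screen(data, ["Step 1", "Step 2"], "Defect Total")
print(ranking[0][0])  # Step 1: the Equipment_B wafers have far more defects
```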
I'm pulling Step 1 over here, and I'm going to make this a little bit bigger. We can see pretty immediately that equipment B looks like the most influential one here. We've got our top-bottom defect, so it looks like we found that one. Let's make this a column switcher: under the red triangle, there's Redo and Column Switcher. We'll look at Steps 1, 4, and 5, and we're going to add Step 8 as well; I'll show you why in a second.
Step 4 was also influential; it looks like we found our left-and-right defect here with equipment L. On Step 5, equipment N has the donut signature. Then the reason I included Step 8 is that it looks like this is where our random defect signature is, but we've only got two wafers for it, so it's not showing up through the model yet. We would need more data, more wafers, for that to show up.
What we've been able to do is figure out what our equipment is that's causing these defects. Let's go back to our PowerPoint and check on our goals and see what we were able to do here. We were able to get actionable information based on data-driven decisions. It looks like we're going to be able to investigate that equipment at each of those steps a little more thoroughly. Maybe there's something else that's causing the issue within that equipment, but we've narrowed it down to that.
We have reusable knowledge. I'll show Workflow Builder here in a second and how everything I just did was captured in it. Speed, flexibility, and power along the entire workflow: this is reusable, and we'll be able to run it again, perhaps for another situation, just by rewriting a couple of steps very easily. Then you have an efficient resolution; it didn't take too much time to do this. We're reducing downtime, enhancing quality, and increasing revenue and cost savings.
The big benefits of doing it this way are, one, quick, efficient decision-making. What we just created here didn't take hours to code or weeks and months of learning to code. It didn't sit in a systems or programming queue waiting to be implemented. You did it yourself; it came from you, the domain expert. Any modifications needed can be easily made because you're the one who understands the process. And your analysis is now standardized and optimized: done correctly and repeatable through automation.
You've recouped time and money by solving manufacturing problems quickly. A little side note here as well: there's real value in you having access to the data, because you're the domain expert, and you'll be able to recognize things that a database manager or a data science team may not. I've seen pushback on scientists and engineers having access to data like this, the concern being that it could break a production database or something like that. But what can be done by a domain expert really is essential for making new discoveries beyond the status quo, beyond just doing it the way we've always done it.
As a solution, you could potentially have a database mirrored from the actual production database. As a result, you can look at what having access to this data has done for scientists' and engineers' productivity at three different semiconductor companies. Access to this tool has released domain experts from tedious data preparation tasks to instead focus on solving engineering problems that create real business value. For the quality engineer here, it was quickly collating process and report information that would otherwise have taken many hours to compile. You're saving that time.
Then the last one here I really like because there are some metrics involved: an 83% reduction in data processing time, significantly decreased costs, and job satisfaction among R&D engineers boosted by 260%. It's a much more direct way to solve a problem than some alternatives that don't keep job satisfaction as high. With that, I want to show the workflow and what it looks like when you run it all together.
I'm going to step back through everything that I did here, and you can see how it's structured. Once again, we're going back to that analytic workflow. I like to group my steps into data access, visualization, data prep, and other things that map to the analytic workflow. I've got a Part 1 and a Part 2, and you can see all the things that I did here. Initially, I recorded all of this sequentially and organized it in here so it was repeatable. I'm going to click Play for the entire thing, and you can see everything pop up: what we did before, saving the graphs to the data table, our multivariate embedding here. I think it's running the model for predictor screening right now.
We got our clustering, actually, and at the very end, predictor screening. We've got this graph at the very end that we were looking at with our steps. Very clearly, we've got equipment B, equipment L, and equipment N causing the defects in our signatures here. All of that is reproducible; you can run it again on new data and see which new wafers are going to each piece of equipment.
With that, I want to show one more thing related to Torch Deep Learning. There's another session on this put on by Russ Wolfinger, where he takes a deeper dive, but I just want to show a couple of the semiconductor examples on surface level here. This is a JMP Pro 18 feature. If you go into add-ins, you'll actually download an add-in separately via the add-in page on the JMP community. If you go to Torch Deep Learning and example data, there's a Torch storybook here. Russ has put together a bunch of different examples for different situations. There happen to be a couple of examples here related to semiconductor wafers and wafer maps.
What Torch Deep Learning is going to enable you to do is do image analysis directly from the image and classify wafers and the defects and signatures that have to do with it. The reason I want to show what this looks like is it looks really similar to what we just did. I'm opening the data table here. You'll notice we've got a picture. We've got a file name that contains the lot ID and the wafer ID, lots, die size, so the number of die within the wafer, then the failure type, so whatever the classification is.
You can train on these images to be able to predict what the failure type is. That's really cool; it's the first time we've been able to have a drag-and-drop interface for image analysis like this at a deep-learning level. It's really, really neat. For more information, I recommend checking out Russ's guide on the Torch Deep Learning add-in, as well as his discovery talk at this year's Discovery.
With that said, that concludes my presentation. If you have any questions, I'd be happy to answer them via email at ryan.cooper@jmp.com. Thank you for listening.