My project covers the application of solution coating onto a semiconductor wafer and the optimization of the process inputs. The goal is to improve the capability of targeting a consistent output thickness while minimizing the thickness standard deviation, providing a more uniform thickness measurement across the entire surface of the wafer. The tools used in JMP include DOE, Fit Least Squares analysis, and the Prediction Profiler with a noise/standard deviation simulator to find the most desirable settings.
This poster shows how JMP's analysis tools can take the input ranges that a piece of equipment is capable of running and fine-tune them for the most desirable and most capable output. For this example, that means minimizing the standard deviation of thickness across the wafer while still being able to target a desired coating thickness.

This is Nicholas. I am an engineer at Medtronic, and I use JMP quite often. For this presentation, I'll be focusing on spin coating, the technology around it, and how we use JMP to build a DOE, then put together a fit model and use the profiler to understand how to optimize and target certain thicknesses across the wafer coating layer.
As an introduction, like I just said, we're using JMP in this presentation to look at semiconductor coating. As you can see in the diagram on the upper left, the technology works by dispensing a solution centrally onto the wafer, which sits on a pedestal that spins, which is why it's referred to as spin coating. We then have different set profiles where it runs at certain speeds for certain durations of time.
Usually how we run it is we'll have an initial step that will run at a slower speed, and it will spread that solution across the entire surface. Then we're able to increase that speed for a longer duration of time, and then depending on the speed that's set there, we are able to hit different thicknesses across that coating layer.
The purpose here, and what we want to define within our DOE, is to target different thicknesses and understand how the process windows for the different spin speeds and dispense volumes impact our coated layer, both in uniformity across the entire wafer surface and in output thickness.
You can see on the right here, these are the different factors that are going in. We're going to be doing a six-factor DOE. We have our low and high values that we'll set, and I'll jump through these when we get into the demo and show how they're actually applied in the DOE. Essentially, what we're trying to do is build a model using these windows, trying to get a good R-square, which tells us how effective the model is at predicting our outputs, and then use that to adjust and target certain outputs, such as thickness, while trying to minimize the variability, or non-uniformity, across the surface.
I'll go into these in more depth once the demo is started. But you can see here on the left, this is essentially an overview of how our coating thickness will look. We're trying to get it as smooth across as possible. Using contour graphing, we're able to see where there may be some high points or low points across the surface. We then move into a fit model, and this is essentially where all of our data runs from the DOE are graphed. I believe we have a bit of stack-up here, which is why you see tall ranges here. We also had a few outliers from one of the wafers that we built; I'll cover that in the next slides.
From those executions and analysis, JMP then puts together a profiler, where you can set desirability to say, okay, I want to hit 10 microns in thickness, or 10,000 nanometers, as you can see here. Then we also want to adjust the factors to minimize any standard deviation across the wafer surface. I'll dive a little more into that in a second.
Here's just an overview of the DOE. When you put all of these together, the low and high values of our six factors, JMP will spit out a table that includes all the runs. This is the run and execution that we did. I'll jump into the demo now and pull up JMP. To step through how to do a DOE, what we usually use is the Custom DOE; it's right here at the top.
That will bring up a dialog where you can put in your outputs and your inputs. For this build, we had our thickness target, so I added a response for our output: I want to match a certain thickness. That's our response name, thickness, matching a target. For us, the lower end will be 8,000 and the upper will be 12,000; these are nanometers of depth on the coating layer that we have. I can add nm for the units.
Then we also want to look at our uniformity. I'll look at the standard deviation of the thickness measurements across the wafer, and we'll call that uniformity. We want to minimize it as much as possible; when you have a higher standard deviation, you see a lot of variability across the wafer. We don't really have a unit for that.
Now we'll add our factors here. For us, there were six different factors, all continuous: dispense volume; dispense speed, since we spin the wafer as we dispense the solution centrally onto it; spin 1 speed; spin 1 time; spin 2 speed; and spin 2 time. This is a three-step process: we have our dispense, then spin 1, where, as we talked about, we spread the solution across, and then spin 2.
Dispense speed is for the dispersing of the solution across the wafer. Spin 1 is where we're actually going to be targeting: we speed it up to try and hit a certain thickness, and that's going to be the most significant factor for hitting thickness. Then spin 2, that last-second, really fast increase in speed, is really just to get rid of any buildup along the edges, a quick whipping of the wafer to remove any extra solution around the edge.
For this build, I'm going to go through this a little quicker. We ran volume from 8.5 milliliters to 10.5 milliliters. Dispense speed went from 100 RPM to 400 RPM; that's just to spread the solution across. We then looked at 500 to 700 revolutions per minute for spin 1 to hit a certain thickness target, throwing solution off to thin it out, and that will run for 40 to 60 seconds.
Then our final one was that really quick snap speed. Going into this build, we really didn't know what speed to run it at, so we set a pretty drastic range of 1,200 to 2,400 RPM, but only for two to six seconds. These process windows are pretty large, so we're expecting some pretty drastic results at the extremes. But this is essentially how we set up our DOE, our Xs and our Ys. I'll hit Continue.
I will put in our units really quick: revolutions per minute as the unit for speed, seconds as the unit for time, and milliliters for our volume. For the model, I'll usually add the second-order interactions to add some additional robustness; this also lets us understand how each of the variables interacts with the others. You can continue up to third- and fourth-order interactions, but that would be excessive; we don't need it for this build.
For our build, we ended up running 24. The nice thing about JMP is it tells you the minimum number of runs required and a default. In the semiconductor industry, we work with cassettes that carry the wafers, and our cassettes hold up to 24 wafers at a time, so the default of 24 matches exactly how we would load up our equipment to run. I can set that, we'll have 24 wafers run through this entire DOE, and then you hit Make Design.
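For reference, a rough JSL sketch of this Custom Design setup is below. It mirrors the responses, factor ranges, and 24-run sample size from the dialog; the exact script JMP saves on your machine may differ slightly, and the factor and response names are just how I'd label them.

```jsl
// Approximate JSL for the Custom Design dialog shown in the demo.
// Factor/response names are illustrative placeholders.
DOE(
	Custom Design,
	{Add Response( Match Target, "Thickness", 8000, 10000, 12000 ),
	Add Response( Minimize, "Uniformity", ., ., . ),
	Add Factor( Continuous, 8.5, 10.5, "Dispense Volume", 0 ),
	Add Factor( Continuous, 100, 400, "Dispense Speed", 0 ),
	Add Factor( Continuous, 500, 700, "Spin 1 Speed", 0 ),
	Add Factor( Continuous, 40, 60, "Spin 1 Time", 0 ),
	Add Factor( Continuous, 1200, 2400, "Spin 2 Speed", 0 ),
	Add Factor( Continuous, 2, 6, "Spin 2 Time", 0 ),
	// second-order interactions were added in the dialog via Interactions > 2nd
	Set Sample Size( 24 ),	// one full cassette of 24 wafers
	Make Design}
);
```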
JMP computes that really quickly and spits out a table with all of our factors and columns for our results. You can see it has a run order, 1 through 24, and each of the factors that we set. For the first wafer, we would run those control factor settings, and likewise through all 24 wafers. What you can do then is look at the power analysis and see how strong of a design this is. It's in the 60s, so it could be a little stronger, but it will work for our need of this initial process window analysis.
I'm going to hit Make Table. The run order was randomized, and that kicks out this table here. We have all our factors as shown in that last table, and then it gives us columns for our thickness output and our standard deviation. I accidentally added a third response column, which I didn't realize; we can ignore that one for now. But thickness and standard deviation are our outputs.
If I pull in the data that we'd already run for this, here are our results. We have our wafer mean thickness and the standard deviation throughout each wafer. Now, these are just the averaged data points from the data set; what we actually capture is around 1,400 data points per wafer. It's quite a bit. I can show here that we have a very large table of all the data points.
How I had this set up is we have our six control factors here, then the actual thickness at each wafer data point, with X and Y coordinates across the wafer, and then the mean output and the standard deviation across each wafer. From there, what I usually like to do is follow a PGA approach: practical, graphical, analytical.
First, we'll go through practically and look at all the data sets. Here is a contour plot of what our surface profile looks like; this was shown in the slide earlier. Each one of these data points is an XY-coordinate measurement that we take to measure thickness. Now, we capture quite a bit of data across the entire wafer, but we know that some of the edge points are in non-critical areas; there are no critical components that we would cut out of that part of the wafer, so we can eliminate those data points.
I already went through and excluded and hid some of those non-critical data points in here. An easy way to do that is when you first get the data, you can click this little header graph, and it gives you a summary of each of the columns. One thing to do is go through and make sure you have all the wafer IDs, and you can see if there are any immediate outliers to find. It can help you clean up the practical side if there are known outliers or defects.
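If you wanted to script that cleanup rather than do it by hand, a minimal JSL sketch might look like this; the coordinate column names and the edge-radius cutoff of 45 are hypothetical placeholders for whatever defines your non-critical edge region.

```jsl
// Minimal cleanup sketch: hide and exclude non-critical edge sites.
// :X Coord, :Y Coord, and the radius cutoff of 45 are assumed names/values.
dt = Current Data Table();
For Each Row(
	If( Sqrt( :X Coord ^ 2 + :Y Coord ^ 2 ) > 45,
		// hidden rows drop out of graphs; excluded rows drop out of analyses
		Row State() = Combine States( Hidden State( 1 ), Excluded State( 1 ) )
	)
);
```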
From there, we move to the graphical. I'll go to Graph, Graph Builder. For this, we'll look at surface profiles across the entire wafer. We have our XY coordinates, and then I'll put thickness as the color for each data point and change the element to a contour. Now, this is currently looking at every wafer we have, so I'll throw wafer ID up into the Wrap zone, and that gives me every wafer run, 1 through 24, all categorized here.
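The equivalent Graph Builder script, assuming the same hypothetical column names, is roughly:

```jsl
// Contour of thickness over the wafer surface, one panel per wafer
Graph Builder(
	Variables(
		X( :X Coord ),        // measurement site position on the wafer
		Y( :Y Coord ),
		Wrap( :Wafer ID ),    // one small panel per wafer, 1 through 24
		Color( :Thickness )   // color each site by measured thickness
	),
	Elements( Contour( X, Y, Legend( 3 ) ) )
);
```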
You can see it's quite a range of thicknesses from this DOE. We went from 2,000 nanometers up to 13,000 nanometers, depending on how the control factors were set. But what we're really focusing on here is determining where we see a lot of non-uniformity. Here, this wafer is a lot thicker centrally and seems a lot thinner on the outside, so it probably doesn't have very good uniformity.
Another way to test that is to go under Analyze and build an individual plot chart with the Control Chart Builder. With this, we can put thickness as our Y and split by wafer ID. Then I'll adjust the axis to run from zero to about 15,000, since I think our thickest wafer was around 13,000.
You can see here it's split up between the wafers from each run, and you can see how the data points behave: some wafers have terrible uniformity, where others are a little better. Just for the sake of time, I'm going to pull up one that we already had split between each of the wafers. I built this a while ago. Here we go: our individual data chart, split up between each of the wafers, where you can see the X-bar, the mean, and the upper and lower control limits for each.
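Scripted, that individual chart subgrouped by wafer is just a couple of lines, again assuming the same column names:

```jsl
// Individual-measurement control chart, split by wafer
Control Chart Builder(
	Variables( Subgroup( :Wafer ID ), Y( :Thickness ) )
);
```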
If you take this and compare it to the contour view, let me make it fit well on the screen, we have this, and then I'll pull up the other chart alongside it. Looking at some of these wafers, this wafer here looks like it has really bad uniformity going across. What I can do on this chart is highlight those data points and see where they land.
Usually when you highlight those, it shows which data points they are; maybe because it's still zoomed out, you can't see it, but I have the list right here. It's the 09-D3 wafer. On 09-D3 here, you can see the thickness is very high in the center and not as high on the outside. Another one where we had some weird data points is 11-D3. We had a couple of high fliers that seem to be outliers. We're not 100% sure why those are there; they weren't part of the excluded data points that I took out.
If I look into 11-D3 and make this a little bigger, it looks like we did have a few data points where there were some small high points causing issues along the wafer surface. Otherwise, it looked pretty similar through the middle range. That's something where we can then look into why we have these weird data points and open a root cause investigation.
That's a nice thing about the visual graphing in JMP: a root cause analysis can be kicked off, like, okay, what's causing this? Was there foreign material or some debris on the surface when we did the measurements? Or is there something else interfering with those data points right there?
After that, once we have a good graphical analysis, we can jump into doing a fit model to understand how well these data points fit together. You go to Analyze, Fit Model, and then take the data: you have your thickness as an output, so you put that in your Y, along with your standard deviation. This is how we find the capability of creating a function that takes the controls and targets certain outputs. That's essentially what we're doing with this fit model.
With that, we can then open a profiler, which is how we can adjust different control factors and see what the expected output would be. Thickness and standard deviation are the outputs, and these are our six control factors, so I'll add those in there. I'll usually run it as standard least squares; standard least squares does a good job of giving you a best-fit model for all the controls to target the outputs. We'll run that.
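As JSL, that main-effects-only fit would look something like the sketch below, with the same assumed column names:

```jsl
// First pass: main effects only, standard least squares
Fit Model(
	Y( :Thickness, :Std Dev ),	// both outputs fit at once
	Effects(
		:Dispense Volume, :Dispense Speed,
		:Spin 1 Speed, :Spin 1 Time,
		:Spin 2 Speed, :Spin 2 Time
	),
	Personality( "Standard Least Squares" ),
	Run
);
```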
This is what popped up. As you can see here, all of our factors are significant in impacting the outputs. This was expected; we gave it a pretty wide process range. Essentially, it means we can't remove any factors to simplify our equation; we'll just have to keep them all included.
If you go down to the Summary of Fit, this is where we can see our r² is 0.909. That's a strong r², and we have a root mean square error of 502. Another thing we can add is the interactions, since we created the DOE with those as well. I'm going to start over and do that too: we'll do a fit model with thickness and standard deviation, then the six control factors added in. With the factors selected, I can hit Macros and do Factorial to Degree. I had a second degree here, and that's how we get our second-order interactions in there.
Now, if I run it, we add thickness and standard deviation to our outputs and let it run. All right, so we have a much longer list, with all the interactions in here as well. Again, everything is significant, so I can't simplify the equation. But if you go to Summary of Fit, we now have an r² of 0.965; we have a much stronger estimator for predicting these data points.
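The refit with the two-way interactions included, written out explicitly the way JMP's Macros > Factorial to Degree expands them, is roughly:

```jsl
// Second pass: main effects plus all two-way interactions (degree 2)
Fit Model(
	Y( :Thickness, :Std Dev ),
	Effects(
		:Dispense Volume, :Dispense Speed, :Spin 1 Speed,
		:Spin 1 Time, :Spin 2 Speed, :Spin 2 Time,
		:Dispense Volume * :Dispense Speed,
		:Dispense Volume * :Spin 1 Speed,
		:Dispense Volume * :Spin 1 Time,
		:Dispense Volume * :Spin 2 Speed,
		:Dispense Volume * :Spin 2 Time,
		:Dispense Speed * :Spin 1 Speed,
		:Dispense Speed * :Spin 1 Time,
		:Dispense Speed * :Spin 2 Speed,
		:Dispense Speed * :Spin 2 Time,
		:Spin 1 Speed * :Spin 1 Time,
		:Spin 1 Speed * :Spin 2 Speed,
		:Spin 1 Speed * :Spin 2 Time,
		:Spin 1 Time * :Spin 2 Speed,
		:Spin 1 Time * :Spin 2 Time,
		:Spin 2 Speed * :Spin 2 Time
	),
	Personality( "Standard Least Squares" ),
	Run
);
```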
I can hit this little red triangle here, and what we can do is add in the profiler; I'll pop it up on the bottom. We can close some of these reports to simplify the view. You have the summary: we have our response for thickness, with an r² of 0.967, so our capability of predicting thickness is pretty strong. Then we have an r² for our standard deviation of 0.965, also a very strong correlation for predicting that.
We'll focus on the profiler here now. Let's see if I can get this window to fit better. In the prediction profiler, you can see we have our control factors on the X axis and thickness and standard deviation, our outputs, on the Y axis. For this, I want to set up the profiler to predict the different desirabilities.
What you do is go into Optimization and Desirability and hit Desirability Functions. That will now add a desirability column. When it hits 1, that means it's the most desirable: whatever you have set for your Xs should hit these targets. But before you do that, you have to go in and set your desirabilities.
For thickness, I want to match a target; you put the middle value as your match. Let's use a round number, 10,000, as our target, with 12,000 as our max and 8,000 as our lower. We want to be within this range, so we hit OK. For standard deviation, we just want to minimize.
Since we're just minimizing, going for that lower end, I'll keep those numbers the same. Now you can see the desirability: how it's currently set up with these inputs, you'll get around 8,300 nanometers on the thickness, which is not hitting where we want on the desirability, and the standard deviation is a little high. Now that you've set your desirability, you can ask JMP to maximize it.
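Those desirability goals can also be stored on the columns as Response Limits properties, which the profiler picks up automatically. A hedged sketch, assuming the same column names as above:

```jsl
// Store the desirability goals as column properties;
// the prediction profiler reads these for Maximize Desirability.
:Thickness << Set Property(
	"Response Limits",
	{Goal( Match Target ), Lower( 8000 ), Middle( 10000 ), Upper( 12000 )}
);
:Std Dev << Set Property( "Response Limits", {Goal( Minimize )} );
```

From there, Optimization and Desirability > Maximize Desirability in the profiler runs the search described next.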
What it does is take those inputs you had and adjust them to get the most desirable outcomes. To hit almost exactly 10,000 nanometers for our thickness and to minimize the standard deviation, these are the control factors it tells us to set. Some of these will be slightly off of settings we can actually enter, so I can adjust, say, to 1,200 here and change this to 550, numbers we can actually input into our equipment.
To go a little further, one thing I like to add is a bit of noise to the predictability. You can add a simulator with a little random noise. The randomness within this noise is drawn from the root mean square error you can see here: about 303 for the thickness and about 22 for the standard deviation. That's what's used to create the noise within the system.
For our controls, take volume. In a real-world manufacturing situation, we'll never have a perfectly fixed, constant volume applied; there are noise factors that will shift it. So I'll set a little random noise on that input factor here. Otherwise, the RPMs and the times are pretty consistent and fixed, so I can keep those as fixed.
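In a saved Fit Model script, that simulator setup shows up inside the profiler clause. The sketch below is approximate: the 0.1 mL noise on volume is an assumed value for illustration, the noise magnitudes echo the RMSE values mentioned above, and your saved script's exact form may differ.

```jsl
// Approximate saved-script form: profiler with a simulator where volume
// varies randomly and the responses get noise at roughly the fit's RMSE.
Fit Model(
	Y( :Thickness, :Std Dev ),
	Effects(
		:Dispense Volume, :Dispense Speed,
		:Spin 1 Speed, :Spin 1 Time,
		:Spin 2 Speed, :Spin 2 Time
	),
	Personality( "Standard Least Squares" ),
	Run(
		Profiler(
			1,
			Desirability Functions( 1 ),
			Simulator(
				1,
				Factors(
					// volume is the noisy input; 0.1 mL sigma is assumed
					:Dispense Volume << Random( Normal( 10.5, 0.1 ) )
				),
				Responses(
					:Thickness << Add Random Noise( 303 ),	// RMSE from the fit
					:Std Dev << Add Random Noise( 22 )
				)
			)
		)
	)
);
```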
When you hit Simulate, you're essentially creating a capability and variability analysis, an understanding of where noise and variability come into your equation. A fun thing with this is you can then play around and adjust these values. Say I change this to 8 and see how that impacts the outputs and our desirability to hit those targets. Then, changing this back to 10.5, you can see how the snap, that very fast speed, is very impactful to our output: if you adjust it, it really shifts that thickness, and now we're not within our desirability.
If you wanted to hit a different thickness, you could just reset your thickness targets. Say maybe 9,000 nanometers is more optimal for us: we can adjust that, keep everything else the same, and run the optimizer again to maximize. It adjusts the settings, and we'll be hitting right about 9,000 nanometers in thickness.
I think that's everything I was looking to cover for this presentation: running a DOE to understand our process windows and ranges and how they impact the outputs, then using desirability to set a nominal operating point. This fit model is how we create our equation, and as we continue to build wafers, we can add those data points into this table, refresh the fit model, and get a stronger and stronger correlation for our prediction profiler. That concludes my presentation.