using plasma-enhanced chemical vapor deposition. This is a bit different, I think, than some of the typical work that's done in industry, where it's a continuously stirred reactor and you can always mix things. Regardless of what happens, you can always measure outputs. But in plasma-enhanced chemical vapor deposition, it's really discrete pockets of stability that you have to work with. Even though we can set up a large parameter space, there can be spots within that parameter space where you may not be able to strike the plasma, or where it could arc because the power density is too high. Since we have a large number of deposition parameters, we need to use a design of experiments to effectively explore that parameter space.

Even if we're able to strike the plasma, there are still issues with thin-film uniformity. We're depositing nanometer-scale films with nearly perfect uniformity across a 12-inch wafer. Once we get that, we still have to hit the targeted film properties. We're going to talk about how to use PECVD to develop new thin films from new precursors.
The first thing we're going to do is talk about Precursor 1. From what I was able to read in the JMP tutorials, the Definitive Screening Design is a very effective way to screen a large number of main effects in the fewest number of experiments. That's key to the work that we do. We want to get the right answer in the shortest amount of time with a data-driven approach. We used a Definitive Screening DOE to explore seven factors in 26 runs.

What we'll do is just open up that initial DOE. This is the setup we came up with for the Definitive Screening DOE. Here are the seven different factors we're varying for the deposition, and our output is going to be this parameter Y, which we're trying to maximize.
If we look at the range of parameters for this type of PECVD processing, this is a very wide range of initial parameters. Again, we're trying to screen for main effects, and our outputs range from about 9 to 34, with a baseline of 21, so we do see an improvement there. One of the things that I always like to do in a DOE is include a center-point replicate or repeat run, both to see how reproducible the instrument is and to make sure that the statistics we generate within the design are valid. These are the two center-point runs, and you can see we get excellent reproducibility.
The other thing that's really useful to do before we get into fitting the model is just to look at the output variable and try to identify any trends: is there anything we can spot quickly and attribute to the main factors? Here there are four points within the DOE with a Y value greater than 30. If we select those points, it's nice to see whether we can find any trends associated with them. One of the fastest ways I've found to do that is a quick multivariate analysis, and we can do it graphically. What we're going to do is take all our factors plus our output variable and generate a multivariate analysis.
Here in this graph, this is our Y value. You can see, as we go from 10 to 30, that the four highest values are highlighted, and the rows are the various factors. For helium, we have the highest values at both the high and low splits, and the same for precursor. But for temperature and pressure, we have the highest values at the lowest splits. I think that's a good initial indication, before we've done any model fitting, that temperature and pressure could be important variables for us to look at.
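To make that graphical multivariate look concrete, here is a rough sketch of the same idea outside JMP, in Python: plot each factor against Y and highlight the highest runs. The CSV export and the column names are assumptions, not the actual study table.

import pandas as pd
import matplotlib.pyplot as plt

# Assumed export of the DOE table; column names are hypothetical stand-ins.
dt = pd.read_csv("dsd_runs.csv")
factors = ["Helium", "Precursor flow", "Temperature", "Pressure"]

# Flag the runs with the highest response so any trends stand out.
best = dt["Y"] > 30

fig, axes = plt.subplots(1, len(factors), figsize=(12, 3), sharey=True)
for ax, name in zip(axes, factors):
    ax.scatter(dt[name], dt["Y"], c=["red" if b else "gray" for b in best])
    ax.set_xlabel(name)
axes[0].set_ylabel("Y")
plt.tight_layout()
plt.show()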
If we go back to the table, the other nice thing about JMP, and it's very powerful, is that even before we fit the definitive screening model, we can use Predictor Screening to identify the most important factors. Again, we use the standard launch, input our factors, set our response to Y, and you can see what the Predictor Screening is telling us: yes, pressure and temperature are very important. But one thing that we didn't catch in that multivariate analysis is the precursor flow. These three factors, pressure, precursor flow, and temperature, appear to be dominant in giving us the highest values of Y.
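Predictor Screening in JMP ranks factors using bootstrap-forest variable importance. A rough analogue outside JMP, just to show the idea, is a random-forest importance ranking; this is not JMP's exact algorithm, and the column names below are assumptions.

import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Assumed export of the DOE table; drop the run that could not be completed.
dt = pd.read_csv("dsd_runs.csv").dropna(subset=["Y"])
factors = ["Pressure", "Temperature", "Precursor flow", "Helium",
           "Spacing", "HF power", "LF power"]          # hypothetical names

rf = RandomForestRegressor(n_estimators=500, random_state=1)
rf.fit(dt[factors], dt["Y"])

# Rank the factors by importance, highest first.
ranking = pd.Series(rf.feature_importances_, index=factors).sort_values(ascending=False)
print(ranking)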
Now I wanted to fit the model, because I think the real power of the DOE is not the runs in the table but the response surface model, which you can use to get predictions for improvement as well as directions to explore further. But when I went to analyze it, it wouldn't work. It turns out we were only able to complete 25 of the 26 runs, and I was not aware that the default Definitive Screening analysis will not run if you did not complete all of the runs. At this point, I contacted Jed at JMP to help me understand how I could get some models out of this data that we had carefully collected over a period of time. I'll turn it over to Jed.
When Bill called, like he said, when he hit that Fit Definitive Screening script saved to the data table, nothing happens. If you look over here, the log is saying that there are runs that are not fold-over or center-point runs, and it's run 17, which is obviously the run that was missing and couldn't be completed in the experiment. What Bill wanted was a way to still fit that Definitive Screening model. We came up with two different approaches, and three models came out of those two approaches.
The first one is related to the Definitive Screening structure. These designs are always built from fold-over pairs, where each run has an opposite. If we can find the fold-over pair, the twin, I guess, of this row 17, we should be able to exclude both rows and then fit the definitive screening design. We just needed a simple way to do that.

What we came up with was basically a couple of shortcuts. I'm going to first standardize the attributes of these columns and change the modeling type to ordinal. As I do this, you'll notice that my ability to select has changed, which helps when I look at a data filter. Now I have boxes rather than histograms, so it's just faster to select.
What we need to do is find the opposite of this row. I have row 17 selected, and you can see that it's high, high, high, low, low, low, and then I'm out of memory space, so I'm remembering high, high, high, low, low, low. I need to find the opposite of that, which is going to be low, low, low, high, high, high. If I just come over here and start working my way down, low, low, low, high, by the time I've set just four of the factors, so just over half of them, I'm down to just one matching row. It just so happens that the very next run was the fold-over pair in this experiment. We can select both of those runs, exclude them, then go back into the column properties and change the modeling type back to continuous.
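The same fold-over search can also be scripted. Here is a minimal sketch of the idea in Python with pandas: code each factor to -1/0/+1 and look for the run whose coded settings mirror the incomplete run. The column names and the row index of the missing run are assumptions, and the coding assumes the center points sit midway between the factor limits.

import pandas as pd

dt = pd.read_csv("dsd_runs.csv")
factors = ["Pressure", "Temperature", "Precursor flow", "Helium",
           "Spacing", "HF power", "LF power"]          # hypothetical names

# Code each factor onto -1 (low), 0 (center), +1 (high).
coded = dt[factors].apply(lambda c: 2 * (c - c.min()) / (c.max() - c.min()) - 1).round(3)

missing = 16                       # row index of the incomplete run (run 17), assumed
mirror = -coded.loc[missing]       # the twin has the opposite setting of every factor

twin = coded.index[(coded == mirror).all(axis=1) & (coded.index != missing)]
print("Exclude rows:", [missing, *list(twin)])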
Now when we hit that Definitive Screening button, it works. We can run that model and see that it's predicting fairly well, and we can see the profiler. But we were also very aware that one run out of 26 is almost 4%, so we're throwing away roughly 4% of the information by excluding it. What we'd really like to do is not throw that information away but find a way to use it. We used the Model Screening platform in JMP Pro to run a bunch of models and then select the best. The two that came out best were a Neural model and a Stepwise model, and I can walk through those really quickly.
The Neural model was fit with our response and our factors. Since this was a DOE, we're going to use the minimum holdback, and I'm just going to choose a random seed so this is repeatable. The Model Screening platform generally suggests about 20 boosts. If I hit Go here, I get a pretty good R-square. I might try to tune this model by adding some more parameters, but when I do, I can see that the R-square is not really changing, so I don't think I want to add more parameters and risk overfitting the model.
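Those workflow ideas carry over outside JMP as well: fix a random seed so the fit is repeatable, hold back a small validation set, and only accept a bigger model if the holdback R-square actually improves. A minimal sketch with scikit-learn follows; it uses a plain multilayer perceptron, not JMP's boosted neural architecture, and the column names are assumptions.

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

dt = pd.read_csv("dsd_runs.csv").dropna(subset=["Y"])
factors = ["Pressure", "Temperature", "Precursor flow", "Helium",
           "Spacing", "HF power", "LF power"]          # hypothetical names

# Small holdback with a fixed seed so the fit is repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    dt[factors], dt["Y"], test_size=0.2, random_state=42)

# Grow the network and watch the holdback R-square; stop when it stops improving.
for hidden in [(3,), (6,), (12,)]:
    model = make_pipeline(StandardScaler(),
                          MLPRegressor(hidden_layer_sizes=hidden,
                                       max_iter=5000, random_state=42))
    model.fit(X_train, y_train)
    print(hidden, round(model.score(X_test, y_test), 3))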
That was one extra way to do it. The second model that came out easily for us using the Model Screening platform was Stepwise. The way we did that was to put our output here and then use the shortcut to build a response surface, which includes all main effects, all squared terms, and all two-way interactions. Then if we change this to Stepwise here, we can hit the Run button and Go. JMP will enter and remove terms until it finds the model that fits best. We can go ahead and run that.
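Conceptually, that stepwise response surface is: expand to all main effects, squared terms, and two-way interactions, then let a selection routine add terms until the fit stops improving. A rough sketch of the same idea outside JMP follows; it uses forward selection scored by cross-validation rather than JMP's stepwise rules, and the column names and the term-count cap are assumptions.

import pandas as pd
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

dt = pd.read_csv("dsd_runs.csv").dropna(subset=["Y"])
factors = ["Pressure", "Temperature", "Precursor flow", "Helium",
           "Spacing", "HF power", "LF power"]          # hypothetical names

# Full quadratic response surface: main effects, squared terms, two-way interactions.
poly = PolynomialFeatures(degree=2, include_bias=False)
X = poly.fit_transform(dt[factors])
terms = poly.get_feature_names_out(factors)

# Forward selection by cross-validation, capped at 8 terms (an arbitrary choice here).
selector = SequentialFeatureSelector(LinearRegression(), direction="forward",
                                     n_features_to_select=8, cv=5)
selector.fit(X, dt["Y"])
print("Selected terms:", list(terms[selector.get_support()]))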
Now we have three models that we want to compare. What I'm going to do is take this first model and save it: I'm going to publish that prediction formula to the Formula Depot. I'm going to give it a really quick name, and we will call this DSD. Whatever, I can't type. We'll call it DSD and then close it. We'll do this with the Neural model as well: we will publish that prediction formula and give it a name, and do the same with this final model, where we will publish that prediction formula. This last one we called Stepwise.
Now I have these three models and I can compare them. We can run the Model Comparison platform for all three of them from within the Formula Depot and get a ranking of the R-squares of those models. We can look at the actual versus predicted, and we can see that they're all predicting about the same.
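Outside the Formula Depot, that comparison amounts to scoring each saved model on the same runs. A small sketch, assuming the fitted model objects from the earlier sketches:

from sklearn.metrics import r2_score

# Score each candidate on the same runs; `models` holds fitted objects
# (anything with .predict()) carried over from the earlier sketches.
def compare(models, X, y):
    for name, model in models.items():
        print(f"{name:10s} R-square = {r2_score(y, model.predict(X)):.3f}")

# compare({"DSD": dsd_model, "Neural": nn_model, "Stepwise": sw_model},
#         dt[factors], dt["Y"])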
We can also look at the predicted values by row, and we can see that this one point from the Definitive Screening Design is the one that was left out when we fit the original Definitive Screening model, and it seems to be important. Probably most importantly, we can look at a profiler for all of these against each other. If I turn off the desirability, we can see how these models compare. For example, we can look at the Definitive Screening model and see that it's showing some curvature where the other two models are not, and over here we can see that the curvature is different for each of those.
Then the question becomes, which model is best and how do I know? That brings us back to Bill.

Thanks, Jed. Let me share my screen. Can you see my screen?

Yes.

I'm going to just... Jed saved all of that to the Formula Depot, so I'm just going to execute the script that will take us to the Formula Depot he already saved. Again, we'll then just go right to the profiler. We're going to fit all three of these, and then we have the profiler.
This, I think, is where the real power of the DOE is, because with the Prediction Profiler we can optimize and maximize the desirability. The response surface models will tell us what combination of factors we need to get the highest value of the output parameter.
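For a single response, that maximize-desirability step comes down to searching the factor space for the settings where the fitted model predicts the largest Y. A rough grid-search sketch follows; the factor ranges are hypothetical and `model` is any of the fitted objects from the earlier sketches.

import itertools
import numpy as np
import pandas as pd

# Hypothetical low/high limits for each factor.
ranges = {"Pressure": (1.0, 9.0), "Temperature": (200, 400),
          "Precursor flow": (50, 500), "Helium": (100, 2000),
          "Spacing": (300, 600), "HF power": (100, 1000), "LF power": (0, 200)}

# Evaluate the model on a coarse grid over the factor space.
grid = pd.DataFrame(list(itertools.product(*[np.linspace(lo, hi, 5)
                                             for lo, hi in ranges.values()])),
                    columns=list(ranges))
grid["Y_pred"] = model.predict(grid[list(ranges)])

# The settings the model predicts to give the largest Y.
print(grid.loc[grid["Y_pred"].idxmax()])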
What was really eye-opening for me is that, if you look at the values we get when we do this optimization, two of the predictions, the Neural net and the Definitive Screening model, are giving us values of the Y parameter greater than anything we saw in our initial data table. We had a maximum value of 34. Let's see. I'm sorry, I just have to get that screen back. It's very unusual for me to see this with a DOE; typically, if you maximize the model, the result is generally close to what you see in the table. But in this case, it looks like we really have some low-hanging fruit. We needed to test this combination of parameters and see whether that prediction was valid or not.
If we go back to our JMP journal, I just want to show you what happened. We took, I think, the prediction from one of the Neural network fits. Again, the highest value in the Definitive Screening DOE was 34, and the model prediction was 42, but when we actually ran it, we saw artifacts in the film that were not acceptable. The plasma itself was stable; there was no way to see this until the wafer came out of the reactor. But you can see there's a bullseye pattern, which is due to film non-uniformity. In this case, the film is very thick in the middle and thin at the edge, which gives us this bullseye pattern. Then if you look carefully, you can see all these small dots over the wafer, which are actually from the holes in the showerhead. The showerhead has thousands of small holes where the gas comes out. In this case, we have a showerhead pattern and a bullseye.
I think the model is telling us what direction to go, but again, plasmas are challenging to use. Even though the model was telling us this film should have the highest value of Y, the film itself was unacceptable. That's where we have to rely on our process knowledge and our theoretical understanding of the process. We know that argon has a lower ionization energy than helium, and if we substitute argon for helium in the plasma, we can get a higher plasma density, which may help us overcome these challenges. What we did is switch to argon, and you can see that, although the film is not perfect, it's much more uniform and certainly good enough for us to measure the physical properties of the film. In this case, we were able to hit a Y value of 46, which again is much greater than 34, so we're certainly trending in the right direction.
What we really wanted to know was whether there were any opportunities for us to further improve the film. Again, that's where the Prediction Profiler and the response surface models are very powerful. If we just look at the trends we see here... I'm going to blow these up so we can see them a little better for each of these cases. The data is really telling us that, in certain cases, there are things we want to investigate. Lower temperatures definitely look like they favor the highest values of Y, pressure appears to be a key parameter, and for low-frequency power in this initial DOE, two of the models say you want to go to higher low-frequency power, while the Stepwise model is giving us the opposite. But you can see that this is really a blueprint for an additional design, to see how far we can push it. Can we go to lower flow rates, lower temperatures, and lower pressures, and further improve the film properties? It's really sequential learning, and that, for me, is the real power of the DOE.
We don't really have time to go through all of that, but what I did is put together a new JMP table with the results from our sequential learning for this set of experiments. Here is the same data we saw in the Definitive Screening DOE; here are our Y values, ranging from 9 to 34, and the different colors are the different DOEs. Here's the next DOE that we did. What you can see is that, based on the trends in the Prediction Profiler and the response surface models, we fixed the low-frequency power at the highest setting we could; it turns out physical limitations of this plasma chemistry prevented us from adding any more than 20% low-frequency power. We also fixed the temperature, since we can't operate this chemistry below 200 and we knew that lower temperature gave us the highest values. So we fixed those two and then did a five-factor DOE focusing on lower precursor flows, various spacings, higher powers, and certainly lower pressures, which had been indicated as one of the most important parameters. If you look at the Y values here, you can see we're definitely trending in the right direction: now we're going from the mid-20s up to 56, certainly above the 46 we saw before.
Then we did the same learning again. The Prediction Profiler indicated what parameters we should explore, and we did another DOE. In this case, we fixed different parameters, but you can see that the trend is the same; now we're hitting up to 66 in terms of our Y value. Then we did one final experiment, and in this case you can see basically the sum of all the knowledge that we gained. It turned out that when we switched to argon, you could add more low-frequency power; you could go from zero to 40%. In our final analysis, this DOE showed that, unlike the first DOE, after we fine-tuned everything and switched gases, the low-frequency power had no statistical impact on the Y value, so we set it to zero. We found that the lowest spacing was the most important, our sweet spot for pressure was 2.3 torr, and we did want to operate at the lowest temperature. So really we had a three-factor DOE, between total power, precursor flow, and argon dilution, to dial in the films, and we could hit a maximum value of 84.
I summarized all of that in a box plot here, which I think really shows the power of the DOE and sequential learning: we started out with a seven-factor Definitive Screening DOE with 26 runs and ended up with a three-factor I-optimal design with 16 runs, and you can see our continued improvement. This was our reference target, so we still have more work to do. But this is a very powerful way for us to screen seven factors with three-level designs in a very short period of time.

I do think it's worthwhile to point out how efficient these modern DOEs are. We started with seven factors, and all of these are three-level designs. A full three-level, seven-factor factorial would be 3^7 = 2,187 runs. We ran 90 runs across these experimental designs and achieved this increase in the Y value. I think these modern designs, the optimal designs combined with the Definitive Screening DOEs, are a very powerful tool for getting the most value from the fewest number of experiments.
The final thing I want to touch on is when we switched to a different precursor. This was really a different challenge. The goal here was to evaluate a different precursor and compare how it stacked up against the initial baseline film. What we tried to do was use all of our learning from those four DOEs and become even more efficient: instead of 90 runs, could we do this in 52 runs? With Jed's help, we put together an eight-factor A-optimal design, but what we found is that the chemistry was shockingly different. The parameter space where we could operate easily with Precursor 1 did not carry over here. In fact, I put together a slide to show you how bad some of these films could look. On the one hand, we could get perfect films; you'd be hard-pressed to tell there's a film there at all. This is a nanometer-thick film, edge to edge on a 12-inch silicon wafer, perfectly uniform. Then we would have films that looked like this, and obviously that's not what we wanted. But the challenge we faced was that we were doing an eight-factor DOE, we were trying to do this quickly and efficiently, and 30% of the runs failed. I'm looking at a table with eight different factors; how do I pick out the factors that are contributing to this?
What we did is create a table with our eight input factors, identify all of the films that had delamination, arcing, or other issues, and then create a metric, just a film metric, pass or fail. It turns out we can fit this categorical variable and see if we can get a model that helps us understand what is really causing these issues across all these films. The first thing we can do quickly is run Predictor Screening again and get a handle on what factor, if any, is really controlling film quality. If we look at this, it's pretty clear, and it's shocking, because this was not the case with Precursor 1: the flow rate of the precursor was far and away the most dominant factor impacting film quality. But what we needed in order to run these experiments is not just knowing which factor matters, but knowing what values we can safely run to generate quality films.
That's where we did the Neural net. Again, we go into Predictive Modeling, Neural, take our factors, and fit film quality as a categorical variable. I'm going to set boosting to 20, as Jed mentioned; that's typically what Model Screening comes up with. We generate our model, and you can see we get excellent R-square values. This is a categorical model, and I think looking at the ROC curve gives good insight into how well the model fits. If the curve sits along the diagonal, the model is basically guessing. This looks like a half square wave; these are essentially perfect fits in the ROC curve.
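Conceptually, that fit is a pass/fail classifier on the eight input factors, and the ROC curve compares its predicted pass probabilities against the actual outcomes: an area under the curve of 0.5 is a coin flip, 1.0 is perfect separation. A minimal sketch of the same idea with scikit-learn follows; it is a plain classifier rather than JMP's boosted neural net, and the file and column names are assumptions.

import pandas as pd
from sklearn.metrics import roc_auc_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumed export of the Precursor 2 runs; all column names are hypothetical.
dt = pd.read_csv("precursor2_runs.csv")
factors = ["Pressure", "Temperature", "Precursor flow", "Argon flow",
           "Helium flow", "Spacing", "HF power", "LF power"]
y = (dt["Film quality"] == "Pass").astype(int)

clf = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(6,), max_iter=5000,
                                  random_state=42))
clf.fit(dt[factors], y)

# In-sample area under the ROC curve: 0.5 is a coin flip, 1.0 is perfect separation.
print("ROC AUC:", round(roc_auc_score(y, clf.predict_proba(dt[factors])[:, 1]), 3))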
Then the question is, how can we utilize that? The nice thing about the Neural net is that you get a Categorical Profiler. I can open the Categorical Profiler, and I set this up knowing that we want to operate at lower pressures, that from the previous work we want lower spacings, and that we want the lowest possible temperature; we'll just set this one in the middle. Then we have this profiler that tells us the primary factor affecting this is really the argon flow rate. If we can keep our flow rate below 400 sccm, we can have a 100% success rate for the films that we're trying to optimize.
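That profiler step can be sketched the same way: hold the other factors at the chosen settings and sweep the flow factor the screening flagged, watching where the predicted pass probability stays near 1. The fixed settings and the swept factor name below are hypothetical, and the classifier carries over from the sketch above.

import numpy as np
import pandas as pd

# `clf` and `factors` carry over from the previous sketch; values are assumptions.
sweep_factor = "Argon flow"
fixed = {"Pressure": 2.0, "Temperature": 200, "Precursor flow": 100,
         "Helium flow": 0, "Spacing": 300, "HF power": 500, "LF power": 0}

sweep = pd.DataFrame({sweep_factor: np.arange(50, 801, 50)})
for name, value in fixed.items():
    sweep[name] = value

# Predicted pass probability along the sweep; pick the highest flow that stays near 1.
p_pass = clf.predict_proba(sweep[factors])[:, 1]
for flow, p in zip(sweep[sweep_factor], p_pass):
    print(f"{flow:4d} sccm -> predicted pass probability {p:.2f}")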
With this, we set up a new DOE, limiting the total flow rate to 400 sccm. We'll go back to our JMP journal. We were able to come up with a new design and complete 41 of the 42 runs, and we're still executing that study. But it just shows how powerful the Neural network is for a categorical variable: we could do this in an afternoon. At one o'clock we found these films weren't working; three hours later we had a model that told us how to set up a new design, and we were executing it later that day. I think that's the material we wanted to cover.