My name is Chris Jackson.
I am an Applications Engineer for Centrotherm.
We design and build point-of-use gas abatement systems
for use in the semiconductor and other industries.
Today, I have the opportunity to give a short presentation
on how we found a space for JMP in our part of the industry
and how it helps us both in troubleshooting
for industrial applications
as well as for assessment
and justification of continuous improvement initiatives,
engineering changes, things like that.
A little bit of background just to get everyone on the same page:
I want to say a couple of words about what point-of-use abatement systems are.
I've got a little cutaway of one of our tools here on the side.
The short version is this:
you've got a manufacturing tool up on the factory floor
doing whatever it's doing in the semiconductor manufacturing process
that produces harmful gasses as a byproduct:
greenhouse gasses, toxic gasses, flammable gasses.
Generally, things you don't want going into the atmosphere.
Then our tools take those waste gasses in,
they destroy them through thermal energy,
they wash them out,
and release clean air to the factory exhaust.
Because these tools are safety- and environment-critical,
a fault in one of them
means that your production line is at least in part shut down.
If you can't treat your byproducts, then you can't run.
In a high-volume manufacturing environment,
as so many semiconductor fabs are,
even small delays are incredibly costly.
We, as suppliers and servicers,
have to have a means to quickly identify problems
and bring the tools back online.
Historically, troubleshooting usually means
opening the tool
and visually identifying failing components,
often after some period of root cause analysis.
But in a modern fab environment,
with the data generated by SCADA or IoT systems,
we have mountains of data available to investigate faults
before we ever touch the equipment.
That gives us a way to guide troubleshooting in the field,
and in some cases, for intermittent faults,
it even lets the factory keep running while we investigate digitally
rather than physically,
minimizing the time lost to troubleshooting and investigation.
The problem with this mountain of data is a scale issue.
The higher the resolution of your data,
the better look you can get at what's happening instantaneously
in any of these pieces of equipment.
That higher resolution, however, comes with overhead.
You need more and more computing resources to analyze it effectively,
and that's where JMP comes in for us:
with its capacity to handle very large data sets,
it becomes a tool for visualization and exploration
that can drastically improve troubleshooting.
It lets an engineer or a technician
quickly explore and visualize important parameters
within data sets that are sometimes at a scale
that is just unmanageable for a lot of other visualization tools.
With that, I want to jump right into the first example case we have here,
and we're going to identify an intermittent single-component failure
just through data visualization.
No statistics, no modeling,
just the ability to sift through and visualize the data.
Here we've got a chart showing ionization current versus time.
Ionization current is one of a number of parameters
that we use as a health monitor for the equipment.
This tool was having issues in which it would run for a couple of days
and then seemingly randomly fail and shut down.
For context, this current should be a flat horizontal line at 25.5,
so it's pretty clear from the outset that we have a problem.
It's also pretty clear what I was talking about
regarding data set size.
This data set right here is almost six and a half million rows.
Six and a half million rows with,
when you pull in all of the tool parameters,
500 columns.
The file for this data set is about 20 gigabytes in size,
absolutely massive amounts of data.
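Just to make that scale concrete, here is a minimal sketch, outside of JMP, of how you might stream an export like that and keep only the columns you care about for a first look. The file path and column names here are placeholders, not the real ones.

```python
import pandas as pd

# A minimal sketch, assuming a CSV export with placeholder file and column
# names. The real export has roughly 6.5 million rows and ~500 columns,
# so we pull only the columns we need and stream the file in chunks.
COLS = ["timestamp", "ionization_current"]

chunks = pd.read_csv(
    "abatement_tool_log.csv",   # hypothetical path to the raw tool export
    usecols=COLS,
    parse_dates=["timestamp"],
    chunksize=500_000,          # read ~500k rows at a time instead of 20 GB at once
)
df = pd.concat(chunks, ignore_index=True)
print(df.shape)
```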
Before we even do any statistical analysis, like I said,
we can start to do some problem-solving off of this data set
just with visualization.
Initially, it doesn't really look like there's any clear shape to this data.
We know something's wrong, but we don't know what.
But when we zoom in,
all of a sudden we start to see some structure.
This looks pretty periodic to me.
We zoom in a little bit more
and we see that it is in fact very periodic.
Each one of these little downward spikes, disregarding magnitude,
is timed almost exactly five minutes from the last.
That immediately raises the question: do we have some component,
a valve, a flow controller, a motor,
something that actuates every five minutes?
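If you wanted to double-check that period numerically before pointing a finger, a few lines over the same export would do it. This is only a sketch, reusing the placeholder column names from above, and the dip threshold is an assumption.

```python
import pandas as pd

# A minimal sketch of confirming the spike period, assuming the data frame
# from the previous sketch (placeholder column names) and the nominal
# ionization current of 25.5 mentioned earlier. The dip threshold is assumed.
NOMINAL = 25.5
dips = df[df["ionization_current"] < NOMINAL - 0.5].copy()

# Collapse runs of consecutive dip samples into single events, then look at
# the time between successive events.
dips["gap"] = dips["timestamp"].diff()
events = dips[dips["gap"] > pd.Timedelta(seconds=30)]
print(events["timestamp"].diff().describe())  # the median should sit near 5 minutes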
Identify that component,
and now we have a really likely troubleshooting culprit.
The troubleshooting plan changes from "open the tool and investigate,"
which could take a couple of hours,
to "open the tool and change this one targeted component."
We just shrank the actual time that we need to be in the equipment
from a couple of hours looking at everything that might be failing
to a single hour: get in there, change this part, get back out.
In this particular case, that was indeed the failing component,
and we were able to identify it.
Problem identified and plan made without ever having to open the equipment;
we got there with just the conclusions
that we were able to draw from visualization.
Of course, JMP is not just a tool for visualization.
It also has at its core a very robust suite of statistical analysis platforms.
If we start to apply those to the data,
we can get even more exciting and interesting results.
I'll just jump right into the second case here.
In this case,
we're looking at a specific tool that is working fine most of the time,
but it does have occasional problems with buildup,
and sometimes we have to pull our preventive maintenance (PM) in a little earlier than we would like.
We want to take a look at our health parameters
and see if there are any abnormalities or any optimizations we can make.
The approach that I use here
is applicable to really any industrial application
that has defined operating modes,
because we can draw those modes out of the data very easily
using clustering.
In this case, this specific abatement system
has three pretty well-defined operating modes
based on these two input gasses.
I use K-means clustering,
but you could use whichever clustering method you prefer.
I run that over the data to sort all of our rows, all of our points,
into these three operating modes.
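For anyone who wants to reproduce the idea outside of JMP, here is a rough sketch of that clustering step; the gas-flow column names are placeholders, and the real work was done in JMP's K Means Cluster platform.

```python
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# A rough equivalent of the clustering step outside JMP, assuming two
# input-gas flow columns with placeholder names.
X = StandardScaler().fit_transform(df[["gas1_flow", "gas2_flow"]])
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)

df["operating_mode"] = km.labels_   # each row tagged with one of the three modes
print(df["operating_mode"].value_counts())
```

The cluster labels then play the same role as JMP's cluster column when coloring the scatter plot later.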
If you have more than three operating modes,
obviously, you can use more clusters.
But it also gets interesting:
what if you don't know how many modes you have?
Maybe they're customer-defined,
or maybe there's a suspicion that,
"Hey, could there be some interstitial mode here?"
Maybe the transition state between two of these operating modes.
If you want to investigate that, you can use iterative clustering.
I did that down here.
You just run over a range of cluster counts; I used 3 to 10,
and the software will identify the optimal number of clusters.
Looking at this, it has correctly identified it.
It gives us these cubic clustering criterion values,
picks out the optimal one,
and confirms that, yes, as suspected, three is the optimal number of clusters
to sort this data into.
I'm not really worried about these state transitions.
I'm really more focused on the states themselves.
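As a sketch of the same iterative idea outside of JMP: scikit-learn doesn't provide the cubic clustering criterion, so this swaps in the silhouette score as a stand-in for comparing cluster counts from 3 to 10 on the same placeholder inputs.

```python
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# JMP scores each candidate cluster count with the cubic clustering criterion;
# scikit-learn has no CCC, so this sketch uses the silhouette score instead
# to compare k = 3 through 10 on the same (placeholder) inputs.
scores = {}
for k in range(3, 11):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    # score on a sample; scoring millions of rows directly would be expensive
    scores[k] = silhouette_score(X, labels, sample_size=50_000, random_state=0)

best_k = max(scores, key=scores.get)
print(scores, "-> best k:", best_k)
```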
We take that data, we get a readout of it,
and we throw it up onto this 3D scatter plot.
We take some of our tool health parameters,
and we color everything by which cluster it's in.
Immediately, we start to see some interesting results.
We talked about how ionization current should be solid at 25.5,
and we see that we have some variability here.
It's dropping below that.
Immediately we know that we have a problem.
But what's more interesting is that every single one of those points
is grouped into a single cluster,
cluster two, which corresponds to the mode with
the lowest input gas one and the highest input gas two.
Now from an engineering perspective,
if I'm looking to make optimizations or I'm looking to improve tool health,
I immediately can say,
"Hey, this is the operating mode that we need to look at."
That's what I need
in order to start looking at concrete next steps for improvement.
I'm not looking at the tool as a whole.
I've already managed to focus my search to one operating mode.
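The same conclusion can be pulled out in tabular form. This is a sketch built on the placeholder columns from the earlier examples: flag the abnormal readings and cross-tabulate them against the operating mode.

```python
import pandas as pd

# A small sketch of the same check in tabular form, using the placeholder
# columns from the earlier sketches: flag the abnormal ionization readings
# and see which operating mode they fall into.
df["abnormal"] = df["ionization_current"] < 25.5
print(pd.crosstab(df["operating_mode"], df["abnormal"], normalize="index"))
# If essentially all of the abnormal readings land in one mode, that's the
# operating mode to focus the optimization work on.
```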
The last thing I want to talk about, then,
having looked at these two use cases, is:
what are the other advantages of JMP?
Why JMP?
My customers are collecting all this data.
They have ways to view it.
They have SCADA systems and monitoring systems in place.
They have ways to parse it.
So why do I, as a supplier/servicer,
need this platform to view and parse the data?
The answer, at least in my case, is the cross-platform compatibility.
If I'm reliant on my customer to chart and generate data views for me,
I'm now taking up their time and their resources
to troubleshoot a problem that I'm responsible for fixing.
With JMP, as long as they can give me the raw data,
I can do all of it myself.
Not only does that free up their resources,
it gives me the ability to do my own investigation
independent of whatever system they're using for data analysis.
It doesn't matter if they're using proprietary monitoring system A or B or C,
or if they're using their own IoT monitoring system
from their control engineers.
It doesn't even matter
if they have multiple data acquisition systems
from different vendors.
With JMP,
I can import and manipulate whatever data they give me
and perform these kinds of analyses, source-independent,
do the investigation that I need to do for my customer support
with all the tools for visualization and statistical analysis
that JMP provides.
With that, it looks like we're pretty much at time here.
I know this isn't necessarily the traditional use case for JMP,
judging from some of the folks that I've talked to,
but I hope it was helpful for people.
I'd just like to thank Adam Stover, our CTO,
and Gordon Tendik, our Director of Apps and Technology,
for helping me put all this together and reviewing the work that I've done.
Thank you for your time.