Retrieving Arbitrary True Part Distributions from Measured Distributions and Gauge Characteristics - (2023-US-30MP-1402)

Jerry Fish, Systems Engineer, JMP

Part distributions are easy to measure: parts are built, an operator measures the parts with a gauge, and the results are assembled into a measured part distribution (MPD).

But the resulting distribution is contaminated by errors associated with the measurement system. Random errors, gauge bias, and linearity problems all contribute to inaccuracies in measuring the true part values, so the individual values can never be truly known.

However, if we had a way to estimate the true part distribution (TPD), we could compare it to the MPD and calculate the impact (cost) associated with using the imperfect gauge in terms of Type 1 and Type 2 errors.

It is trivial to estimate the TPD from an MPD if the gauge creates simple normally distributed errors around a normally distributed TPD (i.e., simply subtract variance of gauge from MPD variance to get TPD variance). But what if the gauge has linearity problems? Or what if the TPD has a non-normal shape?

This paper describes a new JSL script for determining an arbitrary (i.e., non-parametric) TPD from an arbitrary MPD and associated gauge performance characteristics. The resulting TPD can then be fed to a second script to determine production costs associated with the imperfect gauge and setting guardbands to optimize economics of the gauge errors. Performance of the estimation routine is evaluated, in terms of shape of TPD, various gauge characteristics, and resolution of distributions.

Hi, I'm Jerry Fish.

I work for JMP.

I support our customers in the Central Region of the United States.

Today, I'd like to talk to you about an add-in that I've developed.

The title of the paper is

Retrieving Arbitrary True Part Distributions

from Measured Part Distributions and the Gauge Characteristics

that go along with the measurement.

Today's agenda.

First we're going to talk about, of course, what does this talk address?

Why is this so important?

Why can't we just subtract variances to get our True Part Distribution?

A little bit about what's behind our estimation computations.

I'll demo the add-in,

including some test results and add some troubleshooting tips.

I'll tell you where you can find the add-in,

and then we'll share with you how you can give me feedback

on what's good and what you don't like about the add-in,

areas for improvements, and so forth.

What are we addressing here?

Well, we're talking about an add-in that determines a True Part Distribution

if you give it a measured part distribution

and if you describe your gauge performance characteristics.

It's pretty easy to conceptualize if we start with a True Part Distribution.

Here's our true part value versus our percentage of parts.

Then we run that through a gauge,

an imperfect gauge that has some variance and bias characteristics.

We will get a measured part distribution out of that.

We don't know, though, what our True Part Distribution is.

What we're talking about is swapping those positions

where we start with a Measured Part Distribution,

we subtract out our gauge performance characteristics,

and we end up with a True Part Distribution.

That's pretty simple to understand, but it gets more complicated

if we have a Measured Part Distribution that is not normally distributed

and/or we have a gauge that performs in non-standard ways, you might say.

Perhaps our standard deviation shows curvature with the measured part value,

or maybe it has bias that linearly changes,

or maybe it has curvature as well.

How can we take these quantities, the Measured Part Distribution

and an arbitrary gauge performance curve,

and come up with what the True Part Value was

that must have caused this Measured Part Value?

Why is it important?

Well, we all know that all gauges are imperfect.

We'd like to get an idea of this True Part Distribution.

You'll see it referred to as TPD as we go along.

With it, we can understand our Type 1 and Type 2 errors.

A type 1 error means our gauge is throwing away good parts.

A type 2 error means our gauge is accepting bad parts.

Both of these, particularly in a manufacturing environment,

are bad things to happen.

If we're throwing away good parts, then that's waste.

We don't want to have waste in our process.

That's just a straight bottom line deduction from our profit statement.

We also don't want to accept bad parts.

If we do that, we ship the bad parts out to a customer.

We're likely going to get complaints, we're going to get returns, reworks,

it's going to damage our company's reputation.

We don't want to have either one of those types of errors.

They both hurt our company.
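
The two error types described above can be written as a small decision rule. This is only a Python sketch for illustration, not part of the add-in (which is written in JSL); the function name and the spec limits `lsl` and `usl` are hypothetical, and in practice the true value of an individual part is never known.

```python
def classify_inspection(true_value, measured_value, lsl, usl):
    """Classify one inspection outcome against spec limits [lsl, usl].

    Hypothetical helper: assumes a part is "good" when its true value
    lies inside the limits and "passes" when its measured value does.
    """
    truly_good = lsl <= true_value <= usl
    passes_gauge = lsl <= measured_value <= usl
    if truly_good and not passes_gauge:
        return "type 1"   # good part thrown away (waste)
    if not truly_good and passes_gauge:
        return "type 2"   # bad part accepted (ships to the customer)
    return "correct"


# A good part that the gauge reads as out of spec, and the reverse case.
print(classify_inspection(9.8, 10.3, lsl=0.0, usl=10.0))
print(classify_inspection(10.2, 9.9, lsl=0.0, usl=10.0))
```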

If we knew the True Part Distribution,

we could estimate the costs associated with these errors.

That particular subject is addressed in another paper

being presented here at Discovery 2023,

with this title and this paper number. I encourage you to look it up.

I co-authored this paper with two

of my colleagues, Brady Brady, and Jason Wiggins.

We need that True Part Distribution to make this assessment.

Why can't we just subtract the variances?

Well, you can.

If your Measured Part Distribution is normally distributed

and your gauge has constant variance and bias across the measurement range,

then you can get to your True Part Distribution.

You don't know the True Part values of individual parts.

You're never going to know that, but you can get to the distribution.

It's simply, under these constraints, under these assumptions,

the variance of the True Part Distribution is just the difference in the variances

of your Measured Part Distribution and your gauge variance.

You subtract those two,

you take the square root, and you get the standard deviation

of your True Part Distribution.

The average where your True Part Distribution is centered

is simply wherever your Measured Part Distribution

is centered minus the bias of the gauge.
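
For the simple case just described (normal Measured Part Distribution, constant gauge sigma and bias), the arithmetic can be sketched in a few lines. This is an illustrative Python sketch, not the add-in's JSL code; the function and argument names are made up for the example.

```python
import math


def tpd_from_normal_mpd(mpd_mean, mpd_sigma, gauge_sigma, gauge_bias=0.0):
    """Return (mean, sigma) of the True Part Distribution.

    Valid only under the assumptions stated in the talk: a normally
    distributed MPD and a gauge with constant sigma and constant bias.
    """
    tpd_variance = mpd_sigma ** 2 - gauge_sigma ** 2
    if tpd_variance <= 0:
        raise ValueError("gauge sigma must be smaller than the MPD sigma")
    return mpd_mean - gauge_bias, math.sqrt(tpd_variance)


# MPD with mean 0 and sigma 3, gauge with sigma 1 and bias 2.
mean, sigma = tpd_from_normal_mpd(0.0, 3.0, 1.0, gauge_bias=2.0)
print(mean, round(sigma, 3))   # -2.0 2.828
```

The TPD comes out slightly narrower than the MPD and shifted by the gauge bias, which is the behaviour the simple demo cases show.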

Of course, the question is, what do you do

if your Measured Part Distribution is not normal

or if your gauge has unusual characteristics?

This is how we can conceptualize inputting these values,

and I'll show you the add-in in just a second.

We can have any arbitrary input.

Here we've got for our Measured Part Distribution,

this is our Measured Part value.

The counts for however many parts that we have measured.

Maybe this looks like a combination of two normal distributions.

Maybe it's something a little different than that.

The point is, you can put in any input that you'd like,

any input shape for the Measured Part Distribution.

Then we describe the gauge using quadratic functions

for the sigma and for the bias of the gauge.

Normally, all you're going to have for a standard gauge

is just these constants out here in front,

C0 and D0, and D0 may be zero if you don't have any gauge bias.

If your sigma changes linearly with part value,

we allow you to put in C1, and D1 if your bias changes linearly.

If there's any curvature, we allow you to put in a C2 and a D2.

When you set up your gauge equation, if you put in all of these values,

it's possible to generate negative standard deviations

within the measurement range.

Don't do that.

If you can avoid it, don't do that.

There may be unexpected results with the add-in

if you have negative standard deviations.

Just beware of that.
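
One way to guard against that is to evaluate the sigma curve across the measurement range before running the fit. The sketch below is Python for illustration only (the add-in itself is JSL). It assumes the curves have the form sigma(x) = C0 + C1*x + C2*x^2 and bias(x) = D0 + D1*x + D2*x^2, with x the part value; the helper names and the grid of 200 steps are arbitrary choices.

```python
def quadratic(k0, k1, k2, x):
    """Evaluate k0 + k1*x + k2*x^2 (used for both sigma and bias)."""
    return k0 + k1 * x + k2 * x * x


def negative_sigma_points(c0, c1, c2, lo, hi, steps=200):
    """Return the grid points in [lo, hi] where the sigma curve is negative."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return [x for x in xs if quadratic(c0, c1, c2, x) < 0]


# A constant sigma of 1 is safe everywhere on a 0..30 range.
print(negative_sigma_points(1.0, 0.0, 0.0, 0.0, 30.0))

# sigma(x) = 1 - 0.1*x goes negative once x exceeds 10: avoid this.
bad = negative_sigma_points(1.0, -0.1, 0.0, 0.0, 30.0)
print(len(bad) > 0, min(bad) > 10.0)
```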

What's behind the estimation computations?

Well, we start with, of course,

the actual Measured Part Distribution and the gauge characteristics.

We choose an initial estimate of the True Part Distribution.

A good starting point seems to be

the actual Measured Part Distribution.

Then we put that estimated True Part Distribution

through a transformation that represents the gauge characteristics,

and that yields an estimated Measured Part Distribution.

We can then compare the estimated Measured Part Distribution

with the actual Measured Part Distribution on a bin-by-bin basis

and get a Residual Sum of Squares error for that comparison.

Then if we go back and adjust the amplitudes

of the True Part Distribution estimate,

we can adjust those until we get the estimated Measured Part Distribution

to agree as closely as possible with the actual Measured Part Distribution.

We do that using a JSL Minimize function

to try to minimize the Residual Sum of Squares.
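
The loop described above can be sketched as follows. This is a rough Python illustration under stated assumptions, not the add-in's code: it assumes equally spaced bins and normally distributed gauge errors whose sigma and bias depend on the true part value, and it swaps JSL's Minimize for a plain projected gradient descent that keeps the bin amplitudes non-negative. All function names are hypothetical.

```python
import math


def normal_cdf(x, mean, sigma):
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))


def gauge_matrix(centers, sigma_fn, bias_fn):
    """G[i][j]: chance that a part with true value centers[j] is measured in bin i."""
    width = centers[1] - centers[0]          # assumes equally spaced bins
    G = []
    for ci in centers:
        row = []
        for cj in centers:
            mean, sigma = cj + bias_fn(cj), sigma_fn(cj)
            row.append(normal_cdf(ci + width / 2, mean, sigma)
                       - normal_cdf(ci - width / 2, mean, sigma))
        G.append(row)
    return G


def forward(G, tpd):
    """Push an estimated TPD through the gauge to get an estimated MPD."""
    return [sum(g * t for g, t in zip(row, tpd)) for row in G]


def rss(G, tpd, mpd):
    """Bin-by-bin Residual Sum of Squares between estimated and actual MPD."""
    return sum((e - m) ** 2 for e, m in zip(forward(G, tpd), mpd))


def estimate_tpd(G, mpd, iterations=500):
    """Adjust TPD amplitudes to reduce the RSS, starting from the actual MPD."""
    n = len(mpd)
    norm_inf = max(sum(abs(g) for g in row) for row in G)
    norm_1 = max(sum(abs(G[i][j]) for i in range(n)) for j in range(n))
    step = 1.0 / (2.0 * norm_1 * norm_inf)   # safe step for this quadratic
    tpd = list(mpd)                          # initial estimate = actual MPD
    for _ in range(iterations):
        resid = [e - m for e, m in zip(forward(G, tpd), mpd)]
        grad = [2.0 * sum(G[i][j] * resid[i] for i in range(n)) for j in range(n)]
        tpd = [max(0.0, t - step * g) for t, g in zip(tpd, grad)]
    return tpd


# Made-up example: a two-peaked true distribution seen through a gauge
# with sigma = 1 and no bias.
centers = [-10.0 + 0.5 * k for k in range(41)]
G = gauge_matrix(centers, sigma_fn=lambda x: 1.0, bias_fn=lambda x: 0.0)
shape = [math.exp(-0.5 * ((c + 3.0) / 1.5) ** 2)
         + 0.6 * math.exp(-0.5 * (c - 4.0) ** 2) for c in centers]
true_tpd = [s / sum(shape) for s in shape]
mpd = forward(G, true_tpd)
est = estimate_tpd(G, mpd)
print("RSS at start: ", rss(G, mpd, mpd))
print("RSS after fit:", rss(G, est, mpd))
```

Because every bin amplitude is a free parameter here, this corresponds to the arbitrary-shape fit; a parametric fit would adjust only the distribution's parameters (for example a mean and a standard deviation) instead.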

All right, let's take a look at the add-in.

Once you install the add-in,

it will come in under Gauge Study tools and TPD estimation.

This is what the add-in currently looks like, version 1.0.

We start off with the ability to choose

what type of input Measured Part Distribution you have.

Now, that varies. I'll come back to the arbitrary shape in a minute.

We also have normal,

you input the average and standard deviation, LogNormal, Weibull,

Exponential, Gamma, and two-mixture normal distribution.

We can set these up to be parametric if we want.

If you know that you have a Weibull distribution, for example,

you can use that as your input distribution.

Let's start with normal. Let's just make it simple.

Here we have a normal distribution that has a mean of zero

and a standard deviation of three.

That's shown in this panel here in this little graph.

Let's use a very simple gauge

that has a standard deviation of one and a bias of zero.

Click Next.

Here, much like above, we get to choose the True Part Distribution shape.

We could say that's going to be arbitrary, or it could have a normal distribution,

or it could have a lognormal, all the same distributions here.

Or down here at the bottom,

we give you the option to fit all of the distributions above.

Let's again start with a simple example of a normal distribution,

and we'll calculate those results.

We present two output plots.

This plot shows the estimated True Part Distribution in blue

versus the actual Measured Part Distribution in red.

As you would expect in this case,

we've got a normal distribution for the measured, and we've got a simple gauge.

This is one of those that we could solve by hand if we wanted to.

We end up with a slightly narrower True Part Distribution than the Measured.

If we then go to do a check on that,

we can take that True Part Distribution, put it through the gauge,

and we end up with an estimated Measured Distribution versus the actual.

That's what we get down here.

It looks like we've got very good agreement in this particular case.

Let's go back up to our gauge definition.

We'll keep the same measured part distribution.

This time we'll put in a bias of two.

We'll solve again, assuming that

our True Part Distribution is normally distributed.

We get this out.

Pretty easy to conceptualize.

Everything is just shifted over by two units.

Here's our True Part Distribution and our Measured Part Distribution.

If we put that through that gauge with the bias,

we get back to very good agreement between

the actual Measured Part Distribution and the estimated Measured Part Distribution.

Third example, let's say that...

Let's come back up here and I'll turn bias off since we've demonstrated that.

Let's say we've got the same input, we've got the same simple gauge,

but now let's say maybe we've got a Gamma distribution here.

There's Gamma, and we want to fit that.

We'll hit calculate.

This is the best-fit Gamma distribution for that input normal distribution.

You can see it doesn't fit quite as well.

Our True Part Distribution is a little bit skewed, which is characteristic of Gammas.

If you put that through our gauge, we end up with this agreement

between the actual and the estimated Measured Part Distributions.

It's not as good a fit.

A summary of those is given here in this table.

This shows us that the first time we ran this, we did a normal distribution input

with two parameters, we did a normal fit on the output

with two parameters, and we got this sum of squares error.

The second time was with a bias,

and we got the same sum of squares error in the end, as you might expect.

Then with the Gamma, our sum of squares error was a little bit higher.

We get a quick little summary in this table.

There are two other

JMP data tables that are built that have all of this information,

the original distribution and the output distributions

and all the gauge characteristics.

All of those are summarized in these other two tables

to allow you to go through and make your own plots if you want to.

They are also there.

Let's do one that's a little bit different.

Let's come back up to the top and let's choose a user-defined shape.

The data table is simply a two-column data table.

The first column is assumed to be the centers of your part values,

your bin centers, if you will, in that histogram.

The second column represents the amplitudes as you go across.

Those amplitudes can be actual part counts, or they can be percentages.

Anything works, as long as each bin height,

or each histogram bar height, is relative to the other heights.

I scale everything so that the sum of all those adds up

to one within the program anyway.

As long as the relative heights are the same,

it doesn't matter what the actual amplitudes are.
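
The scaling described above can be illustrated with a short Python sketch (not the add-in's JSL; the function name and the sample numbers are made up). Counts and percentages with the same relative heights normalize to the same distribution.

```python
def normalize(amplitudes):
    """Scale bin amplitudes so that they sum to one."""
    total = sum(amplitudes)
    if total <= 0:
        raise ValueError("amplitudes must have a positive sum")
    return [a / total for a in amplitudes]


counts = [4, 12, 4]                # actual part counts per bin
percentages = [20.0, 60.0, 20.0]   # the same shape given as percentages

print(normalize(counts))
print(normalize(percentages))
```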

I give the option to open a data table if it's not already opened,

or if it's already opened within JMP,

then we can just say, select the already opened data table.

Here's an example with a square wave

or a uniform distribution for our input, Measured Part Distribution.

Now, this is a tough distribution to have.

If you think about this, if you've got a gauge

that's making normally distributed errors at any point,

it's going to be really hard to make

something that's nice and sharp and crisp like this distribution on the output.

Let's give it a try.

Let's say here I've got a pretty wide variation.

This goes from zero to 30, I think.

Let's say we've got a gauge that has a standard deviation of five with no bias.

Let's say we want to fit a normal distribution to that,

and now we'll calculate those results.

Here we've got the best fit normal distribution

for a True Part Distribution that's going to run through this gauge

with a standard deviation of five to try to give us this square wave

for our Measured Part Distribution.

How well did we do?

Well, it's here.

It's not a real great shape, and you probably wouldn't expect it

to be a great fit given that we're trying to use a normal distribution

to fit a square, a nice sharp square function.

If we wanted to do an arbitrary function, let's say this one here.

This one I just made up some data.

I'll show you a little bit more about what it is.

Maybe this looks like a

two-mixture normal, two normal distributions mixed together.

Let's check that out.

Let's see if we can fit this to a two-mixture normal,

and that option is down here, and we'll calculate those results.

Here we go.

Let me run that one more time.

I don't want my standard deviation to be that big.

Let's take a smaller standard deviation.

We'll talk about that thing in a minute.

Everything else the same, calculate results.

Sometimes it takes a few seconds to come back.

It just depends on the way the routine is fitting things.

Here is our fitted True Part Distribution

compared to the Measured Part Distribution.

Assuming that our True Part Distribution

is two normal distributions mixed together.

If you run that through the gauge, it ends up looking like this.

This is the attempt to match that

and what our Measured Part Distribution would have been.

That's not too bad.

That's the way that the add-in works.

Now, there is another option here,

and that is when you fit, you can choose whatever inputs you want

for your Measured Part Distribution, your gauge characteristics.

Then when you fit down here, you can also fit an arbitrary shape.

Now, that takes on my PC maybe a minute to run.

I'm going to spare you that and just show you the outputs within a PowerPoint slide.

Here we are back in PowerPoint.

This is one other example that I've got before I get to the arbitrary inputs.

This one has a bias that I've expressed as one plus 0.03 times the part value.

I've got a linearly changing bias across the measurement range.

I have a normal distribution for my input,

and I want to fit a normal distribution to the output.

As it turns out, this is my True Part Distribution,

and this is my Measured Part Distribution.

If I run those through this gauge,

even though it's got this linearly changing bias

across the measurement range, I get very good agreement between the two.

This is what happens if I take that square wave

and say, "Hey JMP, go fit whatever True Part Distribution you want

and run it through a gauge that has a sigma of two and a bias of zero,

and tell me what that distribution might look like."

What you get out, the red curve again

is the measured part distribution, that square wave.

You get this crazy-looking thing with

all different peaks and valleys in it as the True Part Distribution.

Well, that doesn't look like any True Part Distribution that I would have,

but if you look down here, when you run that through the gauge,

it does a pretty good job of simulating this square wave distribution.

This is uniform distribution.

I believe it's working.

Now, there are reasons that it might come up with something like this,

probably associated with the resolution of the gauge that you have.

The gauge just may not resolve as many elements across the measurement range

as you need to get a nice smooth distribution over here.

That's the idea that you can do this with this gauge.

A little about troubleshooting.

There are problems if your gauge standard deviation

is too large in comparison to the Measured Part resolution.

If you have a Measured Part Distribution,

let's say it's normally distributed with a standard deviation of three,

and you tell this add-in that my gauge has a standard deviation of five.

Well, there's no way to get there. Even if you have the same true part

that you measure over and over and over again, you're going to get a spread

that has a standard deviation of five, so it can't fit.

You'll get an error when this occurs.

It's up to you to figure out that,

"Hey, my standard deviation for my gauge is way too large."

It may also be simply on the verge of being too large.

Let's say your standard deviation of your measured distribution is 3,

and your gauge is 2.8,

then again, it's going to try to give you a very narrow

True Part Distribution to support that,

and that can lead to some strange results.
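
A quick pre-check along these lines can flag both situations before running a fit. The Python sketch below is illustrative only and is not something the add-in does; it assumes a normal Measured Part Distribution and a constant gauge sigma, and the 0.9 threshold for calling a case "marginal" is an arbitrary assumption.

```python
import math


def gauge_feasibility(mpd_sigma, gauge_sigma, marginal_ratio=0.9):
    """Return (status, implied TPD sigma) for a normal MPD and constant gauge sigma."""
    if gauge_sigma >= mpd_sigma:
        return "infeasible", None            # gauge alone spreads more than the MPD
    tpd_sigma = math.sqrt(mpd_sigma ** 2 - gauge_sigma ** 2)
    status = "marginal" if gauge_sigma / mpd_sigma > marginal_ratio else "ok"
    return status, tpd_sigma


print(gauge_feasibility(3.0, 5.0))   # gauge sigma larger than MPD sigma
print(gauge_feasibility(3.0, 2.8))   # on the verge: very narrow implied TPD
print(gauge_feasibility(3.0, 1.0))   # comfortable case
```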

There are some odd combinations that I've run across

that can cause these things, I call them untrappable errors.

When you go into the JMP Minimize function,

it does its own thing and then comes back with an answer.

If it runs into a problem, it will throw an error.

Maybe like this.

I've seen two or three different ones, this is one of them.

I don't have a way to trap for those.

If you get an error like this, the add-in will continue to run,

but you'll need to look into what conditions you have put in here

that JMP doesn't like and is having trouble solving for.

Chunky Measured Part Distributions.

If you have a Measured Part Distribution,

in fact, this one here, this might be pretty chunky.

By chunky, I mean there aren't very many bins across the measurement range.

It's related to your gauge in the end. It's how much your gauge can resolve.

You want to have a lot of bins across here.

More bins is better.

Fewer bins makes the true part distribution very difficult to estimate.

Then I mentioned or alluded to earlier that you can have long convergence times,

particularly when you're trying to solve for these

arbitrary True Part Distributions.

On my PC, it's not uncommon to go a minute or a little bit more.

Just hang in there. The add-in has always come back for me.

It doesn't hang. It just takes a while for some solutions.

Add-in availability. It should be attached to this particular recording,

and you should also be able to find it

in the JMP Community File Exchange under TPD estimation.

If you have comments or questions, you can post them either below this video

or on the File Exchange, and please put in there any suggestions that you have

for an improved graphical user interface, any changes in the outputs.

We didn't talk about the data tables that I built,

but if you see those and you decide, "Hey, I wish it would be in this format."

Let me know.

Those are things I can change fairly easily.

Suggestions for more parametric Measured or True Part Distributions.

These are the normal, the lognormal,

the Weibull, the Gamma, all of those functions.

If you have more that you want to add to that,

let me know and I'll see if I can incorporate those.

Then, of course, problems encountered.

If you can include a description of the problem, how it occurred,

that will help me in debugging.

If possible, include a non-confidential sample input file

that I can use to help replicate the problem.

And wherever you post these comments,

please include @JerryFish in your comments, so I'll get a notification.

Thank you very much for listening to this recording.

Don't forget to check out the accompanying Discovery paper,

"News Flash: Gauges aren't perfect, okay, you know that.

But how much is it costing your business?"

under this particular paper number.

Thank you very much for your time.