Live from 100 SAS Campus Drive, your one and only SAS world headquarters in Cary, North Carolina. It's The Sampling Plan Show starring Stan Koprowski. Now, here he is, the host of The Sampling Plan Show, Stan Koprowski.
Thank you. Thank you very much. You're too kind. Thank you. We love you. No, that's enough. Please. All right, we have a great show for you today. Before we get started, just a little background here. I wasn't very good at statistics. In fact, I got a paper cut from my statistics homework. What are the odds? Enough of that.
Today we're going to talk about acceptance sampling and sampling plans. We're going to learn about the OC curve and what it is. I'll be honest, this is one of my all-time favorite ANSI standards. We're going to hear about the ANSI standards. We'll make some predictions for the big sampling plan, and then finally, we'll show you some fantastic highlights using the JMP sampling plan add-in with one of my all-time favorite industrial statisticians, Dr. José Ramirez.
I'm glad to be here. Proud of this show. I think it's going to be fun. Sorry to hear about statistics being hazardous to your health with a paper cut. Man.
It was a rough journey there, but we'll get through it. As you see, here's the title of our talk, and then I will play some other slides here for you. Your book, I like this book. We use this a lot in the division. The Statistical Quality Control book: The JMP Companion. This is the companion book to Doug Montgomery's book. I think you're going to mention Doug's book later in the talk as well.
That's true.
Then I have your other book up here too, Analyzing and Interpreting Continuous Data Using JMP. I know you've been a longtime user of JMP, probably back since JMP 2. I think you were probably one of the first support cases that came into tech support with their own director of customer enablement, Jeff Perkinson. Jeff, as we all know, started out in tech support and I'll have to look back through the cases. But from what I understand, you were a longtime user of JMP.
Welcome to our show. I'm glad you could be with us today. Super excited to have you. We're going to talk about some options here for the folks to call in. If you want to call into the show or message, you can message us at JMP Lot Plan. The phone number, if you have a rotary dial phone, you can call us up there, or if you want to reach us on Instagram at JMP Lot Acceptance Sampling Plan. Let's go ahead and see. I think I do hear a call again. Wait a minute. Let me see who that is.
Yes, this is Dr. Julian Parris calling in. I'm the Director of JMP User Acquisition and I have a question for Dr. Ramirez.
Go ahead, Julian.
Dr. Ramirez, can you explain the difference between lot acceptance sampling plans and variables acceptance sampling plans?
I don't really believe that was Julian. That was probably Julio from down by the school yard. Julio, do you have a question?
Yes, he did. He was asking if you could give us a little introduction to sampling plans.
Sampling, okay. He's doing some sampling by the schoolyard. I wonder what he's doing there. Julio, since you're in a school, you're by the schoolyard, let's go back to the dictionary and look at sampling, what the dictionary says. There are a few definitions here in the dictionary, and one thing I like about the first one is that it talks about a suitable sample. For statisticians, that has a meaning, and part of it is that the sample has to be representative. For those of you familiar with stats, the other piece that we also include in a definition of a suitable sample is random.
But what you're talking about, about these sampling plans is the second definition, and they got it right because a sampling plan is essentially deciding on a small portion of a lot or a population. The population can be either infinite or it can be a finite number like 10,000 items. We want to take a small sample from that so we can do some inspection. By inspection, we mean that we are going to decide the fate of that population or the fate of that lot.
Okay, I understand a little bit of what you're saying there. You're going to take a sample from a population and then what's the difference between the sample types? It looks like you were talking here about an inspection. Is it always an inspection or are there other types of sampling that you can do?
Well, in general, when people think about lot acceptance sampling plans, there's some type of inspection, some type of checking that goes on. The way we're going to do this sampling is we're going to apply some statistical principles [crosstalk 00:06:22].
Oh, gosh, that's scary.
But actually that's what the agencies want. For example, if you look at some of the documents from the FDA, if you're required to do some type of sampling, they want you to do it in a statistical way. How are we going to do that? That's part of this show that we're having. But in the old days, and we're going to talk about those old days, people used standards, and those are the standards that are used right now. The ASQ/ANSI Z1.4 and Z1.9.
That sounds even scarier, but I'm a bit confused. I thought when I was doing my research before I brought you on as a guest, that there were some military standards. I thought the sampling plans were based on military standards and now you just told me there's some other standards in play here. What's the story with this?
Well, yeah, that's true. These sampling plans have a long history. They go back to probably the 1930s, 1940s, and there were two military standards. There's the 105E, which is the one that corresponds to the Z1.4, and that's what we call, using a technical term, sampling by attributes. Then there is the military standard 414, which corresponds to the Z1.9, and that's sampling by variables. You can think of discrete versus continuous.
What happens is that those standards were taken over now by the American Society for Quality, that's ASQ, and the American National Standards Institute, the ANSI, and they rebranded those, and now they're called Z1.4 and Z1.9. But they're still the same standards, the same tables that people use to generate sampling plans.
Let's talk about how we do that. The way to understand this is that every type of lot acceptance sampling plan, and that is the L-A-S-P or LASP, has components and risks. The components of the plan are essentially how many samples do I need to take from a population of 10,000 or infinity that I'm going to test? I'm going to do some inspection, and I want to know if in that sample, there is a prespecified percent defective.
We're going to define quality as a percent defective in the population. What we're doing is taking a sample to see if that sample contains that or less. In order to determine if we're going to pass or fail that lot, we also rely on the acceptance number, which tells us how many defective parts out of the sample we can accept in the plan. Anything that is greater than that, it's going to say we're going to reject or fail the lot. That's why, as I said before, we're determining the fate of the lot with this [inaudible 00:10:03].
Here we go. I had my fate a long time ago with that paper cut. I'm a little anxious here. What kind of fate are we talking about?
Yeah, we're going to decide if we are going to pass the lot and make it available for whatever purpose that lot is being manufactured, or we're going to put it in quarantine [inaudible 00:10:31] and maybe do some more inspections or try to understand why the fraction defective is larger than the prespecified one.
You mentioned that there's some risk involved. What kind of risk? Is it out of my control or is it within my control? Whose risk is this?
Yeah, there's risk, and that's the beauty about my profession. In statistics, you don't have to be certain about anything. I can be 95% confident. I can even go up to 99% confident, I don't have to be certain. Again, for those of you familiar with statistics, you know what we're talking about. There's the chance that there's a perfectly good lot that we're going to reject because, again, we're taking a small sample, so we're not sampling the whole population. There's a risk associated that this sample may determine that a good lot is not good or there's a chance that a bad lot may be released. You can think in terms of a false positive. You think in terms of a medical test.
That's similar to like if you were to take the COVID-19 test. If you actually had it and it came back negative, is that a false positive?
Yeah, it's like that, a false positive. We may have a false positive, which means this lot is bad, but the sampling plan said that it was good. Or vice versa, we have a good lot and we're going to reject it because the sampling [crosstalk 00:12:19]. That's why it's important to use statistical principles in designing these sampling plans. That's part of why people use standards. Those were derived in a way that when you select a plan from those standards, these risks are going to be balanced. That's what we talk about in the generation of the lot acceptance sampling plan.
We have these two risks. Again, we can say that a good lot is going to be rejected and there's also a chance that a bad lot may be released like the false positive. We're going to assign some probabilities to this risk. One thing we want to make sure is that if it's good, we want to accept that most of the time. How do we define that?
Well, standard practice is to use 95%. Again, that sounds like 95% confidence when you use some type of statistical test. What we're saying is we're going to predefine a fraction defective for that lot and we are going to select a plan that guarantees, or almost guarantees, that 95% of the time, a good lot is going to be accepted. A good lot is going to pass.
On the other hand, we're going to also say that we're going to have a high chance of rejecting a bad lot. Or if you flip that, you can say there's a small chance of passing a bad lot. 90% here transforms itself to a 10%. We're saying there's a 95% chance of accepting a good lot, but only a 10% chance of accepting a bad lot. Those are standard numbers in sampling plans, in the standards and the way people use those sampling plans.
You were talking about the user. The user has the option of changing those numbers. Rather than use 95%, we can use 99%. Rather than using 10%, we can use 5%. However, if we get too greedy, then as some of you may know, the sample size is going to increase and that's part of that balance. We want to find a sample size that is small enough that it's going to guarantee these probabilities.
Should we do some examples? Are there things we could do to explain some of the balance that you're talking about between these two competing risks?
Yes.
It sounds like one might be on the consumer side, and maybe one of the risks is on the producer's side.
Exactly. The first chance of ... If you look at the 95% chance, that means there's a 5% chance that a good lot is going to be rejected. That's on the producer side. Because as a producer, you don't want your good lot to be rejected. The loss of material or good product.
On the other hand, the 10% chance of accepting a bad lot, that's on the consumers because we don't want the consumers to receive something that is bad. Now, in order to do this, there's this tool that we use in sampling plans which is called the operating characteristic curve or OC curve. Some people may be familiar with that. Those curves, I'm going to show you some examples, are the ones that are used to find that balance between this risk and probability. Risk and probability.
What is an OC curve? An OC curve is essentially a plot that shows you the probability of accepting a lot as a function of that fraction defective in the population. You look at these two curves, the blue line here. On the Y-axis, we have the probability of accepting the lot, and on the X-axis, we have the proportion defective in the population.
As you can see, when the proportion defective is very small, there's a high probability of accepting the lot. As soon as that proportion defective starts increasing, then the probability of accepting that lot decreases, and that's the shape that we want to see in an OC curve.
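To make the shape of the curve concrete, here is a minimal Python sketch, assuming the number of defectives in the sample follows a binomial distribution. The plan used here (sample 50, accept up to 2 defectives) and the names `prob_accept` and `oc_curve` are hypothetical and chosen only for illustration; they do not come from the talk or from the add-in.

```python
from math import comb

def prob_accept(p, n, c):
    """P(accept) = P(X <= c) for X ~ Binomial(n, p): at most c defectives in n samples."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

# Hypothetical plan, for illustration only: sample 50 items, accept with at most 2 defectives.
n, c = 50, 2
fractions = [0.00, 0.01, 0.02, 0.05, 0.10, 0.15]

# Each point of the OC curve pairs a lot fraction defective with its acceptance probability.
oc_curve = [(p, prob_accept(p, n, c)) for p in fractions]
for p, pa in oc_curve:
    print(f"fraction defective {p:.2f} -> P(accept) = {pa:.3f}")
```

Printing the pairs shows the acceptance probability starting at 1 for a perfect lot and falling as the fraction defective grows, which is the downward shape described above.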
Now, remember we have two probabilities and two risks. We want a 95% chance of accepting a good lot and a 10% chance of accepting a bad lot. Now, the definition of good and bad is in terms of that proportion defective in the population. We as users of sampling plans, we have to define what is a good fraction defective. Of course, ideally, the good fraction defective should be zero, but that will throw a [inaudible 00:17:22] in the math. If you know that zero divided by zero, you get infinity, meaning you have to sample everything if you want perfection. What we do is define a small number, and that is called the AQL, the acceptable quality level.
On the other side, we define the RQL or rejectable quality level. Granted, there are many terms that people use. Sometimes you may see the LTPD instead of RQL. That's the lot tolerance percent defective, the maximum percent defective that you can accept in your population. With the probabilities and those fraction defectives, you define two points in this curve. Here, the AQL is 1.8%, and we have a 95% probability of accepting that.
What that means is, as long as the fraction defective in the lot is less than 2%, let's say, then there's a high chance of accepting that lot. But as soon as it goes higher, like 2%, then we're going to accept that lot very infrequently. These two points are the ones that you look at in the OC curve, and those are the ones that are going to determine the sample size and the acceptance criteria.
Got it.
I think it will help Julio if we run an example.
I think Julio is texting and he said that would be great if we could do an example. Okay, let's do that. Why did you develop or why did we produce a JMP add-in?
Yes, Julio, what we're going to do is we're going to show you an app to do this. People sometimes ask us, "Why did you do this? Why do you spend a lot of time writing code and packaging it in an add-in?" Well, to tell you the truth, [crosstalk 00:19:30].
That is really tiny. I'm going to have to get a new set of readers or a giant magnifying glass. Is that really how people do this?
Now it may seem weird, but still, I believe in some industries, people may use a standard. Granted, it may not be the old book that they use, maybe a PDF, but sometimes they still use this. But these are very tedious and they're discrete in the sense that they're just approximations to the plans.
There's a process to do that. Of course, there are multiple tables that you have to go through in order to find the appropriate sample size. What we wanted to do is make our life easier, actually, because to be truthful here, we use sampling plans. We're also tired of using these tables. So we wanted to automate the generation of the LASPs.
Okay, got it. What else do we need to know before we do some examples?
This is JMP, which is one of the greatest pieces of software out there for doing statistics and getting insights out of your data. Another thing that we did is that not only are we automating the generation of the plans, making life easier, but we're using all the visualization tools in JMP, like the profilers, to understand these OC curves. Remember, we just showed you that the way we determine or generate the plans is via an OC curve. The OC curve is also very important to evaluate a plan. How do we know? If someone gives us a sampling plan, how do we know if that's good? That's part of this.
Let's look at an example. This comes from Professor Montgomery's book. Again, here a shameless PR for us. We wrote a companion book. You saw that book at the beginning for this. It's called JMP Base. If you look at this figure, this is in chapter 15, Professor Montgomery also shows another approximation. It's another way of generating a sampling plan, which is using a nomograph. A nomograph is this figure that you see here. Here, you have to figure out what your AQL and RQL are, the probabilities, and then you have to go in there and approximate that.
Here, they give you an example where they say the acceptable quality level is 2% or 0.02, the rejectable quality level is 8% or 0.08. The probability of accepting a lot that has an AQL of 2% or less is 0.95. As we say, that's high. The probability of accepting a lot that has an RQL of 8% or more is only 10%. Again, these 0.95 and 0.1 are our standard values in industry.
You notice in this diagram here, they put the 0.02, they draw a line here to intersect the 0.95. The 0.08 intersects at that, and then you intersect those two lines and you guess at that. Okay, that's a 90, 3 plan. Meaning, out of an infinite population, you take 90 samples, you inspect them all, and you can accept up to three defectives. If you see more than three defectives, you reject the lot. If you see three or fewer, then you accept the lot. That's the sampling plan. A little cumbersome too.
Yeah, it looks a little difficult to line up exactly. I hope we could get away from doing that with the add-in.
Well, let's show Julio how we can do that with JMP.
Okay, sounds like a good idea. All right, Julio, let's jump over to some examples here. Here is JMP and to get to the add-in, you just go to your add-in, JMP Sampling Plans, and let's look at attribute sampling plans. Then this particular one, we're going to do just a single lot acceptance sampling plan. We have the menu.
After we make that choice, it comes up and it gives us three options. We can evaluate an attributes plan, we can generate or create an attributes plan, or we can compare plans. The add-in gives us the ability to compare up to five different plans. There's an option to keep the dialog open in case you want to look at more than one plan. Let's go ahead and generate that plan that you were just sharing here from Dr. Montgomery's book.
I'll give you the numbers. Let's see.
All right.
This is the interface. [inaudible 00:24:51] AQL.
Right. You have a couple of different sections in the interface. You get to put in your quality levels, your probabilities, and then you have the optional area about the type of lot sampling you're going to do. That lot sampling is based on a distribution. So we could either do a hypergeometric distribution or a binomial distribution.
Let's put Montgomery's numbers there. The AQL in Professor Montgomery's book is 0.02 or 2%. The RQL he has is 0.08.
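As a rough illustration of why the choice of distribution matters, the sketch below computes the acceptance probability both ways: hypergeometric for a finite lot, binomial when the lot is treated as effectively infinite. The function names and the lot sizes (10,000 and 200) are assumptions made for this example; this is not the add-in's code.

```python
from math import comb

def accept_prob_binomial(p, n, c):
    """P(X <= c) for X ~ Binomial(n, p): lot treated as effectively infinite."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

def accept_prob_hypergeometric(lot_size, defectives_in_lot, n, c):
    """P(X <= c) when n items are drawn without replacement from a finite lot."""
    good = lot_size - defectives_in_lot
    favorable = sum(comb(defectives_in_lot, k) * comb(good, n - k) for k in range(c + 1))
    return favorable / comb(lot_size, n)

# Same plan (n = 90, c = 3) and the same 2% fraction defective in every case.
binomial_value = accept_prob_binomial(0.02, 90, 3)
large_lot_value = accept_prob_hypergeometric(10_000, 200, 90, 3)  # 200 of 10,000 defective
small_lot_value = accept_prob_hypergeometric(200, 4, 90, 3)       # 4 of 200 defective

print(f"binomial:                  {binomial_value:.4f}")
print(f"hypergeometric, N=10,000:  {large_lot_value:.4f}")
print(f"hypergeometric, N=200:     {small_lot_value:.4f}")
```

For the large lot the two answers nearly coincide, which is why the binomial is a reasonable choice when no lot size is specified. For the small lot, where the sample is almost half the lot, the finite-lot calculation gives a noticeably different answer.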
All right.
Oh, they're pre-populated. Yes, those are the standard values, 0.95.
The standard values. So we keep the default there. I think you told me there was no theoretical lot size here. So we're going to do type B.
Actually, in Montgomery's book, it says that he's using a binomial nomograph, meaning he's using the binomial distribution. So yes, that's the right choice.
Then all we have to do is hit OK, and then what happens? JMP gives us this curve here and an output window and a report. There are three sections to the report. At the top, you get a summary of the plan. It shows you your input parameters and then what the plan generated as a sample size and acceptance number. Then it gives you some information on how to interpret the plan.
That was helpful.
Out of the plan recommendation here, of the 98 samples, it says you can accept the lot here as long as the number of defectives is less than or equal to four out of that 98 sampled. Otherwise, you reject it if it's greater than that. Then it tells you some additional information there. If we look at that OC curve, I think this is what you were showing us earlier.
Yes.
So we have a probability at 95% and the quality level of 2%. But if I recall, you told me in the literature that this was a plan of 90 and 3. So I'm confused again here or, should I say, Julio texted me and said he's confused. How is it different here? Or why is this different?
We're getting 98 and 4. Professor Montgomery, in his book, he's showing the nomograph and he's getting 90 and 3. Let me go back there just one second. Remember, what we're using is this graph here where you have all these approximations. You have a line that goes 200, 300, you go between 70 and 100.
Got it.
There's not really a 93 or 94, anything like that. As I say, you had to approximate these. This nomograph is just an approximation. What we're actually trying to do is solve these equations, for those of you mathematically inclined. The one minus alpha is the 0.95, the beta is the 0.10, p1 is the AQL and p2 is the RQL. We put all four of those things in here and we solve these equations to find the minimum n that satisfies them. When you do that, actually, the software does that for us, the code that we wrote, we get the 98 and 4.
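The slide with the equations is not reproduced in the transcript. Under the binomial model chosen in the demo, the standard two-point conditions can be written as below, where n is the sample size and c is the acceptance number. Writing them as inequalities rather than equalities is an assumption made here, because n and c are integers and the two probabilities can rarely be hit exactly.

```latex
\begin{aligned}
P_a(p_1) &= \sum_{d=0}^{c} \binom{n}{d}\, p_1^{\,d}\,(1-p_1)^{\,n-d} \;\ge\; 1-\alpha, \\
P_a(p_2) &= \sum_{d=0}^{c} \binom{n}{d}\, p_2^{\,d}\,(1-p_2)^{\,n-d} \;\le\; \beta .
\end{aligned}
```

With the numbers from the example, p1 = 0.02, p2 = 0.08, 1 - alpha = 0.95 and beta = 0.10, the task is to find the smallest n, together with its c, for which both conditions hold.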
I get that.
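One way to carry out that search is sketched below in Python. This is not the add-in's code, which the talk does not show; it is a straightforward search under the binomial assumption, and the names `accept_prob` and `generate_plan` and the `max_n` cap are hypothetical. For each acceptance number c, it finds the smallest n that keeps the consumer's risk within beta, then checks whether the producer's side is also satisfied.

```python
from math import comb

def accept_prob(p, n, c):
    """P(X <= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

def generate_plan(aql, rql, alpha=0.05, beta=0.10, max_n=5000):
    """Return the smallest (n, c) with P_a(aql) >= 1 - alpha and P_a(rql) <= beta.

    max_n is an arbitrary search cap (an assumption of this sketch).
    """
    for c in range(max_n):
        # Smallest n for this c that keeps the chance of accepting a bad lot within beta.
        n = c + 1
        while n <= max_n and accept_prob(rql, n, c) > beta:
            n += 1
        if n > max_n:
            return None
        # Increasing n further only lowers P_a(aql), so this n is the only candidate for c.
        if accept_prob(aql, n, c) >= 1 - alpha:
            return n, c
    return None

plan = generate_plan(aql=0.02, rql=0.08)
print(plan)
```

With the inputs from the example (AQL 0.02, RQL 0.08, 0.95 and 0.10), this search lands on the same 98 and 4 plan reported in the demo.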
The moral of the story here is that the nomograph gives an approximate lot acceptance sampling plan, again, because it's just an approximation game. This is again one of the advantages of using the add-in, because you get-
Sampling plan add-in, of course.
-more exact sampling plan. You showed that we can evaluate a sampling plan. Let's do that. Why don't we evaluate this plan, the 90, 3? Show us how to do that.
Since I left that window open, we don't have to go back to the menu again. Let's evaluate a plan now. We have in that interface, again, we're going to put in our previous 0.02, 2%, and I think he told me the RQL was 8%.
Yes.
This time, though, instead of a sample size of 20, we're going to do 90 and evaluate three. Again, it's a binomial.
Here, you're entering the four quantities that we talked about in the generation, but you also enter the actual sampling plan that you want to evaluate.
Exactly. Now when I say OK, it's going to look... Sorry about that. I'll just redo that quick.
[inaudible 00:30:31], no?
Yes, sorry. I want to evaluate that plan, you get 90 and 3.
Three, yes.
When I rerun that, we get the exact same style of report, but the information is slightly different, I see here. I noticed also down in this table, there's some color coding and direction of the arrows.
Yeah, I see some red there. I see some red, yes. That's an issue.
Is this an indication that the specified quality level is better?
Actually, no. I think that the reason that it's red is because what happens here is what this is telling us. If I use a sample size 90 with an acceptance number of three, you can see the associated probability of acceptance is 0.89. Remember, we wanted that to be 0.95.
So it's actually lower.
It's lower and that's why that is red. This add-in is giving you a signal that your probability of acceptance is less than the one you specified and that's an issue. That's why the 98 and 4 is a better plan. Also you can see at the bottom that the probability of acceptance for a defective lot is 6% rather than 10%. In that case, it's blue because it's better, that probability is better. You're going to have a smaller consumer's risk. You can see that the n is 6.47 versus the 10.67. That is what's happening.
That producer's risk is really just the difference between that one minus the associated probability of acceptance there?
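The evaluation step can be sketched in a few lines of Python, again assuming the binomial model. The function `evaluate_plan` and the keys of its result are hypothetical names for this illustration and do not mirror the add-in's report.

```python
from math import comb

def accept_prob(p, n, c):
    """P(X <= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

def evaluate_plan(n, c, aql, rql, alpha=0.05, beta=0.10):
    """Compare a given plan's actual acceptance probabilities with the targets."""
    pa_good = accept_prob(aql, n, c)  # chance of accepting a lot at the AQL
    pa_bad = accept_prob(rql, n, c)   # chance of accepting a lot at the RQL
    return {
        "accept_at_aql": pa_good,
        "producers_risk": 1 - pa_good,
        "producer_target_met": pa_good >= 1 - alpha,
        "consumers_risk": pa_bad,
        "consumer_target_met": pa_bad <= beta,
    }

# The nomograph plan from the example: n = 90, c = 3, AQL 2%, RQL 8%.
report = evaluate_plan(90, 3, aql=0.02, rql=0.08)
for name, value in report.items():
    print(name, value)
```

For the 90 and 3 plan, the acceptance probability at the AQL comes out near 0.89, below the 0.95 target, while the acceptance probability at the RQL comes out near 0.06, inside the 0.10 target. That matches the red and blue flags described in the report.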
Exactly.
Got it.
Exactly. The producer's risk, we want it to be 5%, and in this case, it's 10%. Now this is something that I haven't seen in any other software. Normally, what you see is the plot on the lot. You get that for the associated AQL and all that, and they assume that the probabilities are the ones that you specify, but they're not.
What the add-in does too, is it flips things around and says, okay, if rather than fixing the AQL and RQL, I fix the probabilities, I want it to be 95 because that's what it is, I'm neurotic that way. I want it to be 95 and 0.1. Then what are the corresponding AQL and RQL for that? That's what that's telling me.
Okay. I wasn't really just seeing double. It's really a different calculation on the right hand side. Got it.
What that says is if the probability of acceptance peaks at 0.95, then the AQL is not 2% but 1.53%.
Okay, got it.
It has to be way less than that.
So it's a... Okay, go ahead.
Also at 0.1, then it's not 8% but 7%.
Okay. 7.3%, roughly.
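Flipping the question around, fixing the probabilities and solving for the quality levels, can be sketched with a simple bisection, since the binomial acceptance probability decreases steadily as the fraction defective grows. The function `quality_level_for` is a hypothetical name, and bisection is just one convenient way to do this; the talk does not say how the add-in solves it.

```python
from math import comb

def accept_prob(p, n, c):
    """P(X <= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

def quality_level_for(prob_target, n, c, tol=1e-10):
    """Fraction defective at which the plan's acceptance probability equals prob_target."""
    lo, hi = 0.0, 1.0  # P_a is 1 at p = 0 and 0 at p = 1 (for c < n)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if accept_prob(mid, n, c) > prob_target:
            lo = mid   # still accepting too often: the answer is a worse quality level
        else:
            hi = mid
    return (lo + hi) / 2

# For the 90 and 3 plan: which quality levels actually sit at 0.95 and 0.10?
effective_aql = quality_level_for(0.95, 90, 3)
effective_rql = quality_level_for(0.10, 90, 3)
print(f"effective AQL: {effective_aql:.4f}")
print(f"effective RQL: {effective_rql:.4f}")
```

For the 90 and 3 plan, this gives values close to the roughly 1.5% and roughly 7.3% quoted in the conversation, both tighter than the 2% and 8% that were specified.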
That's why, in both cases, they are blue. They are blue. Again, just very quickly here to show this. In summary, for this one is that, again, the nomograph gives an approximate lot acceptance sampling plan. The 90 and 3 plan shows you that we're not hitting that.
Are you sharing?
Yes, I'm sharing.
Okay.
Hopefully, people can see that. Our producer's risk is now 10% versus 5%. But there is one more thing that you had there, which is compare. I'm curious, can we use that? I can tell you what I want to do. I want to use the add-in to compare the plan in Professor Montgomery's book, 90, 3, with the plan that the add-in gave us, which is 98, 4.
All right. We did mention that we can compare up to five different plans. You want me to compare the two plans that we just created. All right. Again, that 8% RQL. In this case, we had a 90 and an acceptance of three.
Yeah.
Then I think you told me it was 98 and 4.
That's what they do. I didn't tell you. That's what the add-in gave us. Yeah.
Wow, that, that is pretty slick. Again, if you wanted to do more than two, you would just check the additional rows and then the add-in will calculate up to five different comparisons here. I'm going to go ahead and click OK. Now it looks a little bit different, José, here. Now I'm seeing two OC curves. I'm getting a comparison of both of my OC curves on the same plot I see here.
Exactly. Here the blue one is the plan 90, 3 and the red line is-
The blue is the 90, 3. Yeah, exactly. I see that.
The red line is the 98, 4. You can see that the red line, or curve, is on top of the blue curve. That's what you want to see. You want the curve to be on top, literally. This shows you that you have higher probabilities of acceptance according to the parameters, the AQL and RQL that we prespecified, with the plan 98, 4 than with the plan 90, 3.
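A rough numerical version of that comparison, assuming the binomial model, is sketched below. The grid of fraction defective values and the names `plans` and `table` are choices made for this illustration, not output from the add-in.

```python
from math import comb

def accept_prob(p, n, c):
    """P(X <= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

# The two plans from the example, keyed by the labels used in the conversation.
plans = {"90, 3": (90, 3), "98, 4": (98, 4)}
grid = [0.01, 0.02, 0.04, 0.06, 0.08, 0.10]  # illustrative fraction defective values

# One row of acceptance probabilities per plan, evaluated on the same grid.
table = {name: [accept_prob(p, n, c) for p in grid] for name, (n, c) in plans.items()}

print("p      " + "   ".join(f"{name:>6}" for name in table))
for i, p in enumerate(grid):
    print(f"{p:.2f}   " + "   ".join(f"{table[name][i]:6.3f}" for name in table))
```

On this grid the 98, 4 column is higher than the 90, 3 column at every point, which is the numerical counterpart of the red curve sitting on top of the blue one. The grid only covers the range around the AQL and RQL, so it says nothing about fractions defective far outside it.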
This is very helpful because you may be in situations where you may get an approximate plan from a book or someone may suggest, "Hey, why don't you use this plan?" With this one, you can compare them all and see which one is better. It's easier to negotiate the sample size using these tools than just getting into an argument and saying, "No, we should use 90, 3 because that's in the book," or something like that.
I think that's a great feature to be able to compare, and that way you move the discussion into some actual information rather than something subjective, and individuals can now compare directly. Again, if you want the reference lines, it didn't make sense to show all the reference lines across five different curves.
What we've done is we could just display a single set of reference lines by toggling the filter, and it will update the graphs for you as you toggle between them. That way, if you do want to see that individually, you could do that, and then you're just focused in on the table and the graphs for that particular set of observations. That's, I think, a nice feature, Julio.
It is.
All right. It's time for one more. [inaudible 00:38:32].
I think that Julio is probably getting tired there.
He is very exhausted.
Julio, there's some other things that the add-in can do. So maybe it should be a follow up to this.
Maybe we could do another session for Discovery Summit Japan or Discovery Summit China.
We'll continue showing. [inaudible 00:39:05].
I think I'm getting breaking news coming across here. Just into the news center here of the show. Anyone, it looks like, can get that sampling plan add-in. If you just go to the JMP user community and search for the sampling plan add-in, you could download it yourself.
And that's free. No? That's free?
Absolutely free. We would never charge for that on the community. Go ahead and download that. If you have feedback or you run into issues, feel free to message me, and we'll get those defects entered and get a corrected version out there as soon as we can. Other things just before we wrap up here, I just want to say thank you to José. Thank you for joining us on the JMP sampling plan show today. It was great to have you here. Really great. Thank you again.