Skrombee
Level I

How much data do I need to calculate control limits?

I want to create a baseline control chart and then apply those control limits to new data to see whether a change in the manufacturing process puts the process out of control. How many data points do I need to calculate control limits for the baseline data?

Accepted Solutions
statman
Super User

Re: How much data do I need to calculate control limits?

Sorry, forgot to add, the question of "how much data you need to calculate control limits" is, as already suggested, dependent on the situation. My general guidance is that control limits are calculated when you have enough data to make operative conclusions from the study. This requires a representative sample of the variables acting in the process. I don't know what type of process or product you are looking at, but, for example, imagine you are making batches of a coating. You hypothesize there is variation in coating characteristics (y's) due to insufficient agitation (within batch). You would select multiple samples from within the batch and do this for multiple batches. How many within-batch samples you need depends on how specific your hypothesis is. If you hypothesize there would be stratification, then the sample size might be as small as 2: one from the top and one from the bottom of the batch. If you don't have a specific hypothesis, then randomly sample the within-batch component and likely increase the sample size to increase the likelihood your sample is representative. How many batches? The same logic applies. If you have specific hypotheses as to why there would be batch-to-batch variation, then sample enough to allow those sources to vary in your study. If your hypothesis is that batch-to-batch variation is due to changing lots of raw material, then you would want to sample over multiple lots of raw material. How many is a judgement greatly influenced by your confidence in capturing the potential influence of the raw material lots. Getting a representative study of the x's that contribute to variation is a judgement best made by subject matter experts, not statisticians.

"All models are wrong, some are useful" G.E.P. Box

6 REPLIES

Re: How much data do I need to calculate control limits?

That question is a big one. The answer is, of course, "it depends."

 

Which type of control chart are you using?

 

What is the rational sub-group size?

 

What is the frequency of sampling the sub-groups?

Skrombee
Level I

Re: How much data do I need to calculate control limits?

Sample size is always a tough question but I’m newer to this type of statistics (process control). I was planning on using JMP Control Chart Builder. I want to take one sample at 3 locations within a unit throughout the day (8 time points spread evenly across 2 shifts) for 5 production days. I’m trying to figure out how many weeks I would need to do that for to establish a baseline. I anticipate there would be both time point variability within a day and day to day variability for each location. I was reading about subgrouping. Do I need to take more than one sample per time point and location? I read about 3, 5 or 7 samples per subgroup.

Re: How much data do I need to calculate control limits?

Control charts assume the process is stable. You might consider examining the initial data using Analyze > Quality and Process > Variability Chart. You can nest the sources of variability and change the nesting interactively to look for dominant effects. You can also estimate variance components to quantify the variation that you see.
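
For readers working outside JMP, the variance components idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a balanced one-way layout (equal samples per batch), the method-of-moments estimator, and made-up numbers. The function name `variance_components` and the data are hypothetical, and this is not necessarily the estimation method JMP's Variability Chart uses.

```python
from statistics import mean, variance

def variance_components(groups):
    """One-way random-effects variance components (balanced, method of moments).

    groups: list of equal-length lists, one list of measurements per batch.
    Returns (within_variance, between_variance).
    """
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes a balanced design")
    # Pooled within-batch variance equals the within mean square when balanced.
    ms_within = mean([variance(g) for g in groups])
    # Between mean square = n * sample variance of the batch means.
    ms_between = n * variance([mean(g) for g in groups])
    # E[MS_between] = sigma_within^2 + n * sigma_between^2; truncate at zero.
    between = max((ms_between - ms_within) / n, 0.0)
    return ms_within, between

# Made-up data: 4 batches, 3 samples each (hypothetical values).
batches = [
    [10.1, 10.3, 10.2],
    [11.0, 11.2, 11.1],
    [9.6, 9.8, 9.7],
    [10.5, 10.7, 10.6],
]
within, between = variance_components(batches)
print(f"within-batch variance:  {within:.4f}")
print(f"between-batch variance: {between:.4f}")
```

In this toy data the between-batch component is much larger than the within-batch component, which is the kind of pattern the nesting exercise is meant to reveal.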

 

Also, if you suspect a difference in variability across levels of nesting, a three-way control chart might be helpful. It is very useful in batch processes where samples are taken within a batch but you want to chart batches. The within-batch variation is often less than the between-batch variation, so one control chart and one estimate of the process standard deviation produce unusable charts. Control Chart Builder provides three-way control charts.
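
To make the three-way idea concrete, here is a rough Python sketch of the textbook version: the batch means are charted as individuals with limits from their moving range, the moving range of the means gets its own chart, and the within-batch range gets a third chart. The function name `three_way_limits` and the data are hypothetical, and the limits Control Chart Builder reports may be computed differently, so treat this as an assumption-laden illustration only.

```python
from statistics import mean

# Standard control chart constants for moving ranges of span 2.
E2 = 2.660     # 3 / d2, with d2 = 1.128
D4_MR = 3.267  # upper-limit factor for a moving range of span 2

def three_way_limits(batches, d3, d4):
    """Return (LCL, center, UCL) for each of the three charts.

    d3, d4: range-chart constants for the within-batch sample size.
    """
    means = [mean(b) for b in batches]
    ranges = [max(b) - min(b) for b in batches]
    # Moving ranges between consecutive batch means (between-batch variation).
    moving_ranges = [abs(b - a) for a, b in zip(means, means[1:])]
    center = mean(means)
    mr_bar = mean(moving_ranges)
    r_bar = mean(ranges)
    return {
        "means": (center - E2 * mr_bar, center, center + E2 * mr_bar),
        "moving_range": (0.0, mr_bar, D4_MR * mr_bar),
        "within_range": (d3 * r_bar, r_bar, d4 * r_bar),
    }

# Made-up data: 4 batches, 3 samples each (hypothetical values).
batches = [
    [10.1, 10.3, 10.2],
    [11.0, 11.2, 11.1],
    [9.6, 9.8, 9.7],
    [10.5, 10.7, 10.6],
]
# D3 = 0 and D4 = 2.574 are the standard constants for subgroups of size 3.
limits = three_way_limits(batches, d3=0.0, d4=2.574)
for chart, (lcl, cl, ucl) in limits.items():
    print(f"{chart}: LCL={lcl:.3f} CL={cl:.3f} UCL={ucl:.3f}")
```

Because the limits for the batch means come from the batch-to-batch moving range rather than from the small within-batch range, the means chart is not flooded with false signals.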

P_Bartell
Level VIII

Re: How much data do I need to calculate control limits?

To add one as-yet-unmentioned thought to all of @Mark_Bailey's thoughtful input: do not lose sight of the basic idea of rational subgrouping, which is to construct a sampling strategy, to the best of your process knowledge, such that within-subgroup variation comes as close as you can to capturing the underlying common cause variation in the system, whilst the time interval between subgroup collections gives you a fighting chance of identifying assignable cause variation in a timely manner. Subgrouping is equal parts art (capturing common and special cause) and statistical method (power, etc.).

 

statman
Super User

Re: How much data do I need to calculate control limits?

I suggest you read Wheeler's book "Understanding Statistical Process Control". You must first understand that the control charts originally created by Dr. Shewhart are meant to do 2 things:

1.  Assess the consistency of the basis for comparison (this is the range chart).

2.  Compare two sources of variation to determine which has greater leverage (this is the X-bar chart).

They answer the questions: Where should the focus of work be (within or between)? What is the nature of the investigation (special or common)?

The variation quantified/depicted by the range chart is a function of the x's changing at that "frequency". The first question is: do those x's (within subgroup) exhibit consistent and stable variation? If so, that variation can be quantified, and those sources of variation can be compared to other sources of variation (between subgroup) to determine which has greater leverage. If the range chart exhibits inconsistent (special-cause-like) variation, then you should seek to understand why. If the range chart exhibits consistent variation AND the X-bar chart is "in-control", the largest source of variation is due to the within-subgroup x's. If the X-bar chart is "out-of-control", then the between sources (x's changing at that frequency) dominate.
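
As a rough illustration of how the two charts fit together, the Python sketch below computes X-bar and R limits from baseline subgroups and then checks new subgroups against those fixed limits, which is what the original question describes. The constants A2, D3 and D4 are the standard Shewhart values; the function names and all data values are hypothetical, and four baseline subgroups are used only to keep the example short, not as a recommendation.

```python
from statistics import mean

# Standard Shewhart constants by subgroup size n: (A2, D3, D4).
CONSTANTS = {
    2: (1.880, 0.0, 3.267),
    3: (1.023, 0.0, 2.574),
    4: (0.729, 0.0, 2.282),
    5: (0.577, 0.0, 2.114),
}

def xbar_r_limits(subgroups):
    """Compute X-bar and R chart limits from equal-size baseline subgroups."""
    n = len(subgroups[0])
    a2, d3, d4 = CONSTANTS[n]
    grand_mean = mean([mean(s) for s in subgroups])
    r_bar = mean([max(s) - min(s) for s in subgroups])
    # X-bar limits are driven by the within-subgroup range (R-bar).
    return {
        "xbar": (grand_mean - a2 * r_bar, grand_mean + a2 * r_bar),
        "range": (d3 * r_bar, d4 * r_bar),
    }

def out_of_control(subgroups, limits):
    """Indices of subgroups whose mean or range falls outside the limits."""
    flagged = []
    for i, s in enumerate(subgroups):
        m, r = mean(s), max(s) - min(s)
        mean_ok = limits["xbar"][0] <= m <= limits["xbar"][1]
        range_ok = limits["range"][0] <= r <= limits["range"][1]
        if not (mean_ok and range_ok):
            flagged.append(i)
    return flagged

# Made-up baseline data: subgroups of size 3 (hypothetical values).
baseline = [
    [10.0, 10.2, 10.1],
    [10.1, 10.3, 10.2],
    [9.9, 10.1, 10.0],
    [10.0, 10.2, 10.1],
]
limits = xbar_r_limits(baseline)

# Made-up data collected after the process change.
new_data = [
    [10.1, 10.2, 10.0],  # similar to baseline
    [10.5, 10.6, 10.4],  # mean has shifted (between-subgroup signal)
    [9.8, 10.5, 10.1],   # range has grown (within-subgroup signal)
]
flagged = out_of_control(new_data, limits)
print("X-bar limits:", limits["xbar"])
print("R limits:", limits["range"])
print("flagged subgroups:", flagged)
```

The second new subgroup signals on the X-bar chart and the third on the range chart, which maps onto the between versus within distinction described above.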

"All models are wrong, some are useful" G.E.P. Box