Just joined the community today and had a question about control charts. I am collecting data that consists of 120 measurements taken each minute. My goal is to understand the variation between the measurements (between 1 and 2, 2 and 3, etc.), and I have the measurements in groups for each minute of time (120 measurements for 12:01, 120 measurements for 12:02, 120 measurements for 12:03, etc.). Based on what I am trying to understand and the way in which the data is collected, which control chart would be best to use? I tried searching a few discussions on this, but was still confused about whether my data is continuous or is actually divided into subgroups.
invtanofmark: Lots of issues to consider, in no particular order:
1. When you say 'measurements', I'm presuming some type of continuous variable vs. an attribute or count variable? The answer matters with respect to chart type.
2. Can you create rational subgroups of observations? If no, then a chart of individuals is required. If yes, then some kind of variables chart (xbar/R perhaps?) with subgroups is called for.
3. Are your observations independent of each other, or do you think there may be correlation among successive observations at some lag? I might check with JMP's Time Series platform to see if there is some autocorrelation among observations. If the observations are independent, then the variables chart approach might work. With correlation, you'll want to consider some type of time-series-oriented chart such as an EWMA.
4. Is this a Phase I or Phase II type investigation? From your first post, it looks like Phase I? So I might just start with an I/mR chart and see what it tells you?
5. Lastly, how stable is your measurement system? Do you have control charts in place there? If not...how do you know the data you've collected is due to process vs. measurement system variability?
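To make point 3 concrete, here is a rough independence check you could run outside JMP as well. This is only a sketch in Python with simulated numbers standing in for your readings; swap in your actual measurement column:

```python
import numpy as np

rng = np.random.default_rng(0)
# Placeholder data: replace with your 120-per-minute scanner readings.
x = rng.normal(55, 1, size=600)

def lag_autocorr(series, lag=1):
    """Sample autocorrelation of a series at the given lag."""
    s = np.asarray(series, dtype=float)
    c = s - s.mean()
    return np.dot(c[:-lag], c[lag:]) / np.dot(c, c)

r1 = lag_autocorr(x, lag=1)
# Rough screen: |r1| beyond about 2/sqrt(n) hints at autocorrelation.
cutoff = 2 / np.sqrt(len(x))
print(f"lag-1 autocorrelation: {r1:.3f} (rough cutoff +/- {cutoff:.3f})")
```

If r1 comes back well outside the cutoff, that points toward the EWMA/time-series route rather than a plain variables chart.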
For some 'how to in JMP' and thought provokers I suggest taking a look at the Mastering JMP event I host:
Thank you for responding. Based on what I understand here are a few answers:
1) The measurements are being taken continuously by a scanner across a width of 200" on a sheet of paper. A total of 120 measurements are taken over the course of 1 minute. The measurements are taken to see how much variation exists across the sheet of paper. So, 120 measurements are taken of the same feature within a one-minute window.
2) In regards to the subgroups, this is where I am confused. I would think that each minute of time would represent a subgroup of measurements. Say at 12:01 120 measurements are taken, then at 12:02 another 120 measurements are taken, and so on. I am not interested in the changes between the measurements as time goes on, but instead in the changes between two measurements at the same time. I don't care if there is a difference between the measurement taken at 12:01 at point (1) and the measurement taken at 12:02 at point (1). I do want to understand the difference between the measurements taken at 12:01 between point (1) and point (2), and so on.
3) All the measurements are taking the same reading, just at different points across the width of the sheet.
4) This is a phase 1 analysis. Just trying to get an idea of where we stand.
5) Control charts are in place, but it is uncertain how often they are actually used to manage the process.
Thanks for the additional information. In the interest of full disclosure, for many years I worked for a major photographic materials manufacturing company..."You Push the Button...We Do the Rest" was one of the company's marketing slogans many, many years ago. So I have some background in process monitoring on a moving web at high speed for various continuous (coating thickness for example) and attribute (lines and streaks) data types. Sometimes the data was captured real time via optical scanning systems...other times off line via product testing.
Here are some more thoughts;
1. The idea behind forming rational subgroups is to set up a subgrouping strategy that captures process common cause variation within subgroups, with the goal of giving you a fighting chance of finding assignable cause variation between subgroups. Quite frankly, I'd be hesitant to use one-minute time intervals as a basis for forming rational subgroups. That's what I'd call a convenience subgroup...not a rational subgroup. I really think you are in single-measurement-in-time land.
2. Variation across the sheet of paper SCREAMS correlation among the positions both across the sheet AND down the sheet, especially if there is some kind of extrusion or coating going on, and I'm presuming the sheets are being cut from a web...Have you checked for that? It sounds like you are gathering data at specific widthwise locations? If there is evidence of correlation among the locations, maybe some kind of multivariate chart approach in JMP might be appropriate? Indeed, in my On Demand Webinar I offer just such an example.
3. In phase I, I'd take univariate and multivariate distributional views (shape, outliers, anything else suspicious?) and a second-by-second (run chart) view and see what you can discover...this should help lead you to the appropriate control charting method.
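The cross-web correlation screen in point 2 can be prototyped quickly before reaching for a multivariate chart. A sketch with simulated data; the minutes-by-positions layout is an assumption about how the scans get stored:

```python
import numpy as np

rng = np.random.default_rng(1)
minutes, positions = 30, 120
# Simulated scans: rows = one-minute scans, columns = cross-web positions.
base = rng.normal(55, 0.5, size=(minutes, positions))
# A shared per-scan shift makes neighboring positions move together,
# mimicking web-wide (machine-direction) disturbances.
scans = base + rng.normal(0, 1, size=(minutes, 1))

# Correlation between cross-web positions, estimated across the scans.
corr = np.corrcoef(scans, rowvar=False)   # 120 x 120 matrix

# Average correlation between adjacent positions: a quick screen before
# committing to an independence-based charting approach.
adjacent = np.array([corr[i, i + 1] for i in range(positions - 1)])
print(f"mean adjacent-position correlation: {adjacent.mean():.2f}")
```

High adjacent-position correlation would support the multivariate (or at least correlation-aware) charting route.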
Regarding 1), it sounds like you are saying that the reasoning behind forming subgroups is to understand variation between batches of things, such as comparing one batch of paint to another made at a different time. In my case, would it be better to look at one set of 120 measurements taken at a 1-second interval? Looking back, I think my reasoning for dividing the scan collections into groups was to see whether the variation between measurements is increasing with each cross-direction scan. I don't know if this is the correct way to approach this type of variation.
2) I agree with you, there is certainly a correlation between the measurements in both the machine direction and cross direction of the web. For this particular process the sheets are being cut from a web. Could you explain more about the multivariate charts or, if possible, provide a link to your webinar?
3) In regards to the run chart, I am struggling to understand how large a sample size I should plan to collect. Would I first need to use the Sample Size and Power tool? In the meantime I will collect some more data at a one-second interval over 30 seconds to build the run chart.
Subgrouping is what I'll call 'agnostic' to batch process behavior...unless your practical guide is to treat batch-to-batch differences as KEY to your practical process monitoring information and decision-making approach. Rational subgrouping (size of subgroup and frequency of subgroup creation) is a sampling strategy to get your best estimate of the magnitude of common cause variation within a subgroup, with an eye towards maximizing the probability of detecting the presence of assignable cause variation between the subgroups in a timely fashion for your practical needs. That's it...nothing more, nothing less.
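The within/between distinction above is easy to see numerically. A small sketch with made-up numbers, purely to illustrate the two variance components a rational subgrouping tries to separate:

```python
import numpy as np

rng = np.random.default_rng(2)
k, n = 25, 5                        # 25 subgroups of size 5
# Common-cause noise within subgroups, plus a shift between subgroups
# standing in for assignable-cause variation.
subgroup_shift = rng.normal(0, 2, size=(k, 1))
data = 55 + subgroup_shift + rng.normal(0, 0.5, size=(k, n))

within_sd = data.std(axis=1, ddof=1).mean()   # avg within-subgroup SD
between_sd = data.mean(axis=1).std(ddof=1)    # SD of subgroup means
print(f"within ~ {within_sd:.2f}, between ~ {between_sd:.2f}")
# A rational subgrouping aims to keep 'within' down to common cause only,
# so that real shifts show up as a large 'between' component.
```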
Using the JMP sample size and power tools are usually associated with statistical inference...not process monitoring...I would not use traditional sample size and power methods UNLESS your problem is REALLY one of statistical inference...not process monitoring.
Now having said this...the literature is full of schemes for establishing sampling strategies for process monitoring. Elements to be included are things like ARL (average run length), alpha and beta risks, differences to detect...and sampling schemes (which almost always are influenced primarily by budget issues!). Ultimately these schemes usually use simulation methods to compare and contrast different scenarios. These methods are way too complex to explore via an online forum such as this.
There is no one-size-fits-all method or workflow for establishing a sampling strategy for process monitoring. To channel my inner Shewhart, he might have said something like: "We seek a method by which we can achieve an economic balance between the probability of making one of two mistakes...going to look for trouble when none truly exists...and failing to look for trouble when it indeed exists." I think the key phrase there is "economic balance".
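To give a feel for the ARL idea mentioned above without going deep into the literature, here is a toy simulation; for a Shewhart chart with known 3-sigma limits the in-control ARL should land near the textbook value of about 370. Purely illustrative, not a sampling-scheme design tool:

```python
import numpy as np

rng = np.random.default_rng(3)

def run_length(limit=3.0, shift=0.0, max_n=100_000):
    """Number of observations until the first point beyond +/- limit sigma."""
    for i in range(1, max_n + 1):
        if abs(rng.normal(shift, 1.0)) > limit:
            return i
    return max_n

# Average run length with the process in control vs. after a 1-sigma shift.
arl_in = np.mean([run_length() for _ in range(1000)])
arl_shift = np.mean([run_length(shift=1.0) for _ in range(1000)])
print(f"in-control ARL ~ {arl_in:.0f}, ARL under a 1-sigma shift ~ {arl_shift:.0f}")
```

The trade-off the schemes formalize is visible here: tighter limits raise the in-control ARL (fewer false alarms) but also slow detection of real shifts.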
How you define the group depends to a large extent on what you are trying to detect:
- If you suspect there is a strip of the sheet that is consistently different (always thicker, or always darker) (e.g. 18" to 20" on your 200" sheet), it would be sensible to use an XBar and R chart with the sampling location as Group. You would have 100 groups/100 locations, and Group 10 would show up as out of control on the XBar chart.
- If you suspect there is a strip of the sheet that is more variable (thickness or color varies too much) (e.g. 18" to 20" on your 200" sheet), it would be sensible to use an XBar and R chart with the sampling location as Group. You would have 100 groups/100 locations, and Group 10 would show up as out of control on the Range chart.
- an XBar and R chart with Minute as the group will tell you if the roll is changing over time (not what you are interested in, if I understood you correctly)
- since you have groups, an Individual Range chart or Run chart would not be my first choice.
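The first two bullets can be sketched numerically. A minimal XBar/R calculation with position as the subgroup; the data is simulated, with one position deliberately shifted high, and the A2/D3/D4 constants for subgroup size 5 come from standard SPC tables:

```python
import numpy as np

rng = np.random.default_rng(4)
positions, times = 120, 5          # subgroup = one cross-web position,
                                   # sampled at 5 different times
data = rng.normal(55, 1, size=(positions, times))
data[9] += 6                       # position 10 runs consistently high

# Shewhart XBar/R control chart constants for subgroup size 5.
A2, D3, D4 = 0.577, 0.0, 2.114

xbar = data.mean(axis=1)
ranges = data.max(axis=1) - data.min(axis=1)
xbarbar, rbar = xbar.mean(), ranges.mean()

ucl_x, lcl_x = xbarbar + A2 * rbar, xbarbar - A2 * rbar
ucl_r, lcl_r = D4 * rbar, D3 * rbar

out = np.where((xbar > ucl_x) | (xbar < lcl_x))[0]
print("out-of-control positions (0-based):", out)
```

A strip that is more *variable* rather than *shifted* would instead put its subgroup range above `ucl_r` on the R chart.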
Ok, I think I am starting to understand regarding the charts. I have also been reading some of Thomas Pyzdek's coverage of control chart concepts. According to Thomas, I-MR charts are used in the case when you are really only taking one measurement. So if I wanted to look at the variation between readings of Box 1, I could use an I-MR chart to compare measurements taken over time. Instead, I really want to look at how the measurements vary in the cross direction (as you said, 18" into the sheet, 20" into the sheet, and so forth). I think for this case I will use the Box locations (1-120) as the subgroups and then use 6-10 random sample times as the data seen in each group. Did you pick 100 groups as a general number or do any specific calculations to arrive at this?
I have been in the paper industry for many years and have struggled to understand the use of Process Behavior Charts (Control Charts) for data generated by continuous scanners that move back and forth across the sheet as the sheet is moving perpendicularly to the travel of the scanner. One of the biggest problems with continuous measurement by any on-line instrument is the issue of autocorrelation. If the measurement frequency is faster than the ability of the sheet properties to actually change in a meaningful (practical) manner, the autocorrelation is going to KILL you. This is evidenced by very narrow control limits on your Control Chart compared to the ups and downs of the individual data. The reason is that the average moving range is very small due to the autocorrelation. Thus, very tight upper and lower control limits are calculated. I worked with a Black Belt candidate a few years ago on this very type of problem. The answer was to randomly sample the autocorrelated data at some reasonable frequency in order to construct useful Process Behavior Charts. As alluded to already, there is also the issue of cross-machine variation.
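The tight-limits effect described above is easy to reproduce. A sketch comparing moving-range-based individuals limits on autocorrelated data versus a random subsample; an AR(1) series stands in for the scanner signal, and d2 = 1.128 is the standard constant for a moving range of 2:

```python
import numpy as np

rng = np.random.default_rng(5)

# AR(1) series standing in for a fast scanner on a slowly-changing sheet.
n, phi = 2000, 0.95
x = np.empty(n)
x[0] = 0.0
for i in range(1, n):
    x[i] = phi * x[i - 1] + rng.normal(0, 1)

def imr_limits(series):
    """Individuals-chart limits from the average moving range (d2 = 1.128)."""
    mr_bar = np.abs(np.diff(series)).mean()
    half_width = 3 * mr_bar / 1.128
    return series.mean() - half_width, series.mean() + half_width

lcl_full, ucl_full = imr_limits(x)
# Random subsampling breaks the lag-1 dependence, widening the limits
# toward something that reflects the actual process spread.
sub = x[np.sort(rng.choice(n, size=100, replace=False))]
lcl_sub, ucl_sub = imr_limits(sub)

print(f"full-data limit width: {ucl_full - lcl_full:.2f}")
print(f"subsampled limit width: {ucl_sub - lcl_sub:.2f}")
```

The full-rate data yields deceptively narrow limits because successive points barely move; the random subsample recovers a more honest estimate of common-cause variation.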
If I can be of further help to you, please feel free to contact me. (firstname.lastname@example.org)
The suggestion to use random samples sounds like it would be the way to go. I have found out that although the scan is measuring at 120 points across the sheet, the output number that the operators see is actually the average value of the scan (for all 120 measurements at one time interval). However, it is the amount of variation between the points that is the concern. Say at one edge of the sheet the measurement is 55; it is not an issue if the next measurement is 55.5 or 56. However, if the measurements jump from 55 to 68, this is a problem. It is even acceptable to start with 55 at one edge of the sheet and finish with a measurement of 70 at the opposite end, as long as the measurements have a small amount of change between them, such as 55, 56, 57, 57.5, and so on.
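That "small change between neighbors" criterion reads like a check on first differences across the sheet. A sketch; the jump tolerance of 2 units is a made-up number for illustration, not something from this thread:

```python
import numpy as np

# One cross-web scan: 120 readings from edge to edge (illustrative values).
scan = np.linspace(55, 70, 120)          # smooth edge-to-edge drift: acceptable
scan_bad = scan.copy()
scan_bad[60] = 68.0                      # sudden mid-sheet jump: not acceptable

def big_jumps(readings, tol=2.0):
    """Indices where adjacent readings differ by more than tol."""
    return np.where(np.abs(np.diff(readings)) > tol)[0]

print(big_jumps(scan))       # smooth ramp passes
print(big_jumps(scan_bad))   # the jump near position 60 is flagged
```

Charting these adjacent differences (rather than the raw readings or the operator-facing scan average) would target exactly the variation described.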
Also, I would like to complete a capability analysis to see how effective the current process is in staying within customer specification levels. Are there any problems I should watch out for with using this kind of data for a capability analysis?
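For the capability question, a minimal Cp/Cpk calculation is below. The data and spec limits are placeholders to substitute with real readings and customer specs, and the usual caveats apply: the process should be demonstrably stable first, the indices assume roughly normal data, and autocorrelated scanner data will bias a short-term sigma estimated from moving ranges:

```python
import numpy as np

rng = np.random.default_rng(6)
# Placeholder data and spec limits: substitute your readings and customer specs.
x = rng.normal(55, 1.5, size=500)
lsl, usl = 50.0, 62.0

mean, sd = x.mean(), x.std(ddof=1)     # overall (long-term) sigma
cp = (usl - lsl) / (6 * sd)            # potential capability
cpk = min(usl - mean, mean - lsl) / (3 * sd)   # capability allowing for centering
print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}")
```

Given the within-scan correlation discussed above, a random subsample (as suggested earlier in the thread) is probably the safer input for the sigma estimate here too.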