More a thread to think about how to best describe variability than a pure JMP question.
The interest comes from a background in the semiconductor industry, but it has applications elsewhere. What metrics should be used to describe the variability over a part, a wafer in this case?
The first answer is always variance/standard deviation, but here is a bit more background. When we judge the results of a process we typically take lots of measurements across the wafer. The locations of those points are NOT random. Typically either a series of points across a diameter or a concentric series of rings is taken. There are often real limitations on where measurements can be made, and there are really no options to change them.
The second complication is that, for the most part, the processes have very systematic trends across the wafer with a random noise factor on top of that. Patterns are often center high/edge low, or sometimes more complex but still predictable.
Currently two metrics are used: standard deviation or Max-Min (range). I have looked at results for many decades now and find both metrics wanting. Typically the standard deviation predicts a much larger range of results than is actually seen, and the Max-Min technique doesn't even capture the points measured. Side note: the metric is (Max - Min)/(2 × Average), but if the average is not in the center of the range, then Average ± (Max - Min)/2 will not include the most extreme values.
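To make the side note concrete, here is a minimal Python sketch (standard library only) that computes both metrics for a single wafer. The function name `uniformity_metrics` and the center-high/edge-low profile are made up for illustration; they are not real wafer data.

```python
import statistics

def uniformity_metrics(values):
    """Mean, sample standard deviation, and the (Max - Min)/(2 * Average) metric."""
    mean = statistics.fmean(values)
    vmin, vmax = min(values), max(values)
    half_range = (vmax - vmin) / 2
    return {
        "mean": mean,
        "stdev": statistics.stdev(values),
        "half_range_pct": 100 * half_range / mean,
        # The band implied by reporting "Average +/- half range"
        "band": (mean - half_range, mean + half_range),
        "min": vmin,
        "max": vmax,
    }

# Hypothetical center-high / edge-low diameter scan (illustrative numbers only)
profile = [92.0, 99.0, 100.0, 100.5, 101.0, 100.5, 100.0, 99.0, 93.0]

m = uniformity_metrics(profile)
low, high = m["band"]
covered = low <= m["min"] and m["max"] <= high

print(f"mean={m['mean']:.2f}  stdev={m['stdev']:.2f}  "
      f"half-range={m['half_range_pct']:.2f}%")
print(f"band=({low:.2f}, {high:.2f})  min={m['min']}  max={m['max']}  "
      f"covers extremes: {covered}")
```

Because the edge points pull the average below the middle of the range, the lower end of the band sits above the measured minimum in this example, which is the effect described in the side note.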
The central limit theorem says the non-random points and systematic variation are okay for calculating the mean, but it really doesn't address the variability or range of the measurements.
Should I just stop worrying about this, or is there a better solution? One reason for the concern is that in optimization calculations the two common metrics give two different sets of optimal conditions.
A bit more information might help - such as how large your samples are. From your description that the standard deviation implies more variability than is observed, it sounds like the distribution is not nearly normal. Looking at the kurtosis might be worthwhile. If the sample size is not large, then I suspect the central limit theorem might not apply here - especially if the distribution has high kurtosis and/or is very skewed. If your sample size is large, perhaps you should try fitting some distributions to the measurement data. Depending on the distribution, other measures may be more appropriate for describing it.
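As a rough illustration of the kurtosis and skewness check, here is a small Python sketch using only the standard library. It uses the plain moment-based (population) formulas; statistical packages often report bias-adjusted versions, so the numbers may differ slightly from what your software shows. The function name `shape_stats` and the sample values are hypothetical.

```python
import statistics

def shape_stats(values):
    """Moment-based skewness and excess kurtosis (population formulas)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    skewness = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3.0  # 0 for a normal distribution
    return skewness, excess_kurtosis

# Hypothetical edge-low profile: a long lower tail gives negative skewness
sample = [92.0, 99.0, 100.0, 100.5, 101.0, 100.5, 100.0, 99.0, 93.0]
skew, kurt = shape_stats(sample)
print(f"skewness={skew:.2f}  excess kurtosis={kurt:.2f}")
```

With only a handful of points these estimates are noisy, so they are better read as a hint about the shape than as a formal test of normality.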
Here is a bit more on the problem (opportunity). We are monitoring a process line and the performance of the tools, and we need a few metrics to describe the performance on a sample, which in this example is a wafer. Typically we have a couple of different ways of sampling the wafer, usually either a diameter scan or a contour map. A diameter scan will have about 50 points, and for a contour map anywhere between 9 and 121 points is not uncommon. The longer the measurement takes, the fewer points you get, just to manage the throughput of the measurement tool. Now we need to come up with some metrics to describe the result, the first of course being the mean, and that is straightforward. Then comes how to describe the variation across the wafer, for which usually either standard deviation or Max-Min is used.
From a physics/chemistry/tool mechanics perspective, the mean is typically a slower-moving metric, and performance deviations are typically seen in the variation across the wafer. So how do we best describe it simply, and preferably in a consistent way across the multiple measurements taken as the wafer is processed? For some additional perspective, it is not unusual for a wafer to need 500+ processing steps to complete and to see a very large range of process and measurement tools.
My interest is not to find the silver bullet; if one existed, the semiconductor industry would surely have adopted it by now. Rather, it is to get thoughts and ideas. I have attached a couple of examples of wafer data so you can get a sense. Each file represents data from a single wafer. How would you describe the variation across them?
I don't have enough background on this process to say much, but looking at your data, I think the geographic dimension makes simple distributions an inappropriate way to proceed. For example, the measurements in the diameter example show clear drop offs at both low and high x values - I assume this is expected and relates to the edges of the wafer. Then, I think the meaningful variation would be in the middle - and the pattern there seems quite systematic. In fact, the variation is probably expected and the variability you would want to measure is around the expected pattern that is systematic. I think this calls for more modeling of the systematic pattern and then measuring the variability around that.
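Here is one way the "model the systematic pattern, then measure variability around it" idea could be sketched for a diameter scan, in plain Python. It assumes, purely for illustration, that the systematic part is a symmetric bowl, y = a + b*x^2, with x measured from the wafer center; a real process may need a different model. The helper names and the synthetic data are hypothetical.

```python
import math
import statistics

def fit_quadratic_bowl(x, y):
    """Least-squares fit of y = a + b * x**2 (symmetric about x = 0)."""
    u = [xi * xi for xi in x]  # regress y on x squared
    u_mean = statistics.fmean(u)
    y_mean = statistics.fmean(y)
    sxx = sum((ui - u_mean) ** 2 for ui in u)
    sxy = sum((ui - u_mean) * (yi - y_mean) for ui, yi in zip(u, y))
    b = sxy / sxx
    a = y_mean - b * u_mean
    return a, b

def residual_sd(x, y, a, b):
    """Standard deviation of the residuals around the fitted bowl."""
    residuals = [yi - (a + b * xi * xi) for xi, yi in zip(x, y)]
    dof = len(residuals) - 2  # two fitted parameters
    return math.sqrt(sum(r * r for r in residuals) / dof)

# Synthetic diameter scan: center-high bowl plus a small alternating "noise"
x = [-100 + 10 * i for i in range(21)]  # position in mm
y = [100 - 0.0005 * xi * xi + (0.1 if i % 2 == 0 else -0.1)
     for i, xi in enumerate(x)]

a, b = fit_quadratic_bowl(x, y)
raw_sd = statistics.stdev(y)
res_sd = residual_sd(x, y, a, b)
print(f"a={a:.3f}  b={b:.6f}  raw sd={raw_sd:.3f}  residual sd={res_sd:.3f}")
```

The fitted coefficients describe the expected shape (for example, how strongly center-high the wafer is), while the residual standard deviation describes the scatter around that shape. Reporting the two separately may be more informative than a single raw standard deviation that mixes them.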
The contour data is more complicated since the geography varies in two dimensions. Using graph builder to plot the X,Y coordinates and overlaying the data provides a pattern. Like the diameter data, I suspect there is some regularity to what is expected (although plotting this data looked quite irregular to my untrained eyes). If you model that regularity then I think you would want to measure variability from that expected pattern.
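For contour data taken on concentric rings, one possible sketch is to treat each ring's mean as the expected radial pattern and look at the scatter within rings separately. This is only an illustration of the idea, not an established method from this thread; the function name `split_radial`, the rounding used to group points into rings, and the 9-point map below are all assumptions.

```python
import math
import statistics
from collections import defaultdict

def split_radial(points, ndigits=1):
    """Split (x, y, value) points into ring means and within-ring residuals."""
    rings = defaultdict(list)
    for x, y, v in points:
        # Points at the same rounded radius are treated as one ring
        rings[round(math.hypot(x, y), ndigits)].append(v)
    ring_means = {r: statistics.fmean(vs) for r, vs in rings.items()}
    residuals = [v - ring_means[round(math.hypot(x, y), ndigits)]
                 for x, y, v in points]
    radial_range = max(ring_means.values()) - min(ring_means.values())
    # RMS of within-ring residuals; single-point rings contribute zero
    nonradial_rms = math.sqrt(sum(e * e for e in residuals) / len(residuals))
    return ring_means, radial_range, nonradial_rms

# Hypothetical 9-point map: center, ring at 50 mm, ring at 90 mm,
# center-high with a small side-to-side tilt along x
points = [
    (0, 0, 100.0),
    (50, 0, 99.7), (0, 50, 99.5), (-50, 0, 99.3), (0, -50, 99.5),
    (90, 0, 98.4), (0, 90, 98.0), (-90, 0, 97.6), (0, -90, 98.0),
]

ring_means, radial_range, nonradial_rms = split_radial(points)
print("ring means:", ring_means)
print(f"radial range={radial_range:.2f}  non-radial RMS={nonradial_rms:.3f}")
```

Here the radial range summarizes the center-to-edge trend and the within-ring RMS summarizes what is left over (tilt, local defects, noise). Whether ring means are an adequate model of the expected pattern is a process-specific judgment.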
I don't know if this helps at all, but I'm afraid I don't know enough about how the production process works to say anything more intelligible.