Level: Intermediate
Hadley Myers, JMP Systems Engineer, SAS
Chris Gotwalt, JMP Director of Statistical Research and Development, SAS
The calculation of confidence intervals of parameter estimates should be an essential part of any statistical analysis. Failure to understand and consider “worst-case” situations necessarily leads to a failure to budget or plan for these situations, resulting in potentially catastrophic consequences. This is true for any industry but particularly for pharmaceutical and life sciences. Previous work has explored various methods for generating these intervals: Satterthwaite, Parametric Bootstrap and Bias-Corrected (Myers and Gotwalt, 2020 Munich), and Bias-Corrected and Accelerated (Myers and Gotwalt, 2020 Cary), which were all seen to have error rates that were too high for the small samples typical in DOE situations. Therefore, we make use of the new “Save Simulation Formula” feature in JMP Pro 16 in an add-in that improves upon these by allowing users to perform a “Bootstrap Calibration” on the Satterthwaite estimates. The add-in also includes the ability to do this for linear combinations of random components, taking advantage of another addition to JMP Pro 16. Further, we investigate a new version of the fractionally weighted bootstrap that respects the randomization restrictions of variance component models, as an alternative to the parametric bootstrap, using the new “MSA Designer” debuted at this conference.
Transcript |
Hello, my name is Chris Gotwalt. My co-presenter Hadley |
Myers and I are presenting an add-in for obtaining improved |
confidence intervals on sums or linear combinations of variance |
components. This is part of a series of talks we have given as |
we work on improving and evaluating several approaches. |
Obtaining confidence intervals on sums of variance components |
is important in quality because it provides an uncertainty |
assessment on the repeatability plus the reproducibility of our |
measurement system. The problem is that when we ask for a 95% |
confidence interval, there are approximations involved and the |
actual interval coverage can be as low as 80%. |
In our previous studies, we found that two methods have |
improved coverage rates, parametric bootstrapping and |
Satterthwaite intervals, but their coverage was still less than 95% in small |
samples. The earlier version of the add-in implemented the |
parametric bootstrap as a stopgap and Elizabeth Claassen |
implemented the Satterthwaite intervals in the Fit Mixed |
platform natively in JMP Pro 16. I want to stop here to give |
Elizabeth Claassen credit for making interval estimation of |
linear combinations of variance components so much easier in |
JMP Pro 16. She also greatly extended the Mixed Model output, |
which has made this presentation vastly easier. I'm also hoping |
that this presentation will serve as an inspiration to |
others to check these new Save commands out so they can get |
more from JMP Pro’s Mixed Model capabilities. Now we're going to |
combine the two approaches using a technique called Bootstrap |
Interval Calibration that was introduced by Loh in a 1991 |
Statistica Sinica article. Bootstrap Calibration is a very |
general procedure for improving the coverage of confidence |
intervals that can be applied to almost any parametric |
statistical model. I'm going to introduce the basic idea of |
Bootstrap Interval Calibration in the simplest terms that I |
can, and hand the mic over to Hadley, who's going to demo the |
add-in and discuss our simulation results. To make this |
simple, let's make it specific. Consider a very small nested |
Gauge R&R-type study where we want to estimate the total |
variation. We collect the data and run a nested variance |
components model with an Operator effect, a Part within |
Operator effect, and a residual effect. The software reports a |
Satterthwaite-based interval on the total. It's well known that |
this is an approximation that assumes a “large” |
amount of data is present in order for the actual coverage of |
the interval to be close to 95%. |
In small samples, the actual coverage, the probability that the |
interval procedure generates intervals that actually contain |
the true value of the estimated quantity, will tend to be less |
than 95%. The thing is, the actual interval coverage is a |
complicated function of the design, true values of |
the functions, and a long list of other assumptions that are hard |
or impossible to verify. What we can do, though, is use the fitted |
model and its parameters to do a parametric bootstrap. When we |
do this, we know the true value of the quantity we are |
estimating because we were simulating using that value. |
We can do the simulation thousands of times. We apply the |
same model fitting process to all the simulated samples. We |
can collect the intervals from JMP and calculate how often |
they contain the generating value of the quantity that you |
were interested in. In this case we were interested in the sum of all |
the variance components, so the true value is 4.515. |
Suppose we took our original data set, took the estimates, |
used the Save Simulation Formula that comes from Fit Mixed, |
and generated a large number of new data sets, and applied the |
same model fitting process that we applied here to each of them, |
and we collected up all of the confidence intervals that were |
reported around the total. After having done this, suppose |
that the estimated coverage, the estimated proportion |
of times that these intervals actually contained the truth, |
turned out to be 88%. |
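The coverage check described above can be sketched in Python. This is a minimal illustration under stated assumptions, not what Fit Mixed does internally: it uses a balanced one-way layout (Operator plus residual) rather than the nested model, ANOVA mean squares rather than REML, the classical Satterthwaite degrees of freedom for a linear combination of mean squares, and the Wilson-Hilferty approximation for chi-square quantiles so that only the standard library is needed. All function names, variance values, and sample sizes are hypothetical.

```python
import random
from statistics import NormalDist, mean

_STD_NORMAL = NormalDist()

def chi2_quantile(p, df):
    """Wilson-Hilferty approximation to the chi-square quantile (an assumption)."""
    z = _STD_NORMAL.inv_cdf(p)
    c = 2.0 / (9.0 * df)
    return df * max(1.0 - c + z * c ** 0.5, 1e-12) ** 3

def total_variance_interval(groups, alpha):
    """Satterthwaite interval for var_operator + var_residual (balanced one-way layout)."""
    a, n = len(groups), len(groups[0])
    grand = mean(y for g in groups for y in g)
    group_means = [mean(g) for g in groups]
    msb = n * sum((m - grand) ** 2 for m in group_means) / (a - 1)
    msw = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g) / (a * (n - 1))
    # Total = MSB/n + (1 - 1/n) * MSW, a linear combination of mean squares.
    terms = [(msb / n, a - 1), ((1 - 1 / n) * msw, a * (n - 1))]
    total = sum(t for t, _ in terms)
    df = total ** 2 / sum(t ** 2 / d for t, d in terms)
    return (df * total / chi2_quantile(1 - alpha / 2, df),
            df * total / chi2_quantile(alpha / 2, df))

def simulate(a, n, var_op, var_res, rng):
    """One parametric-bootstrap data set: a operators, n measurements each."""
    data = []
    for _ in range(a):
        effect = rng.gauss(0.0, var_op ** 0.5)
        data.append([effect + rng.gauss(0.0, var_res ** 0.5) for _ in range(n)])
    return data

def estimated_coverage(a, n, var_op, var_res, alpha, n_sim, seed=1):
    """Fraction of simulated intervals that contain the generating total."""
    rng = random.Random(seed)
    truth = var_op + var_res  # known, because we simulate from it
    hits = 0
    for _ in range(n_sim):
        lo, hi = total_variance_interval(simulate(a, n, var_op, var_res, rng), alpha)
        hits += lo <= truth <= hi
    return hits / n_sim

if __name__ == "__main__":
    # Hypothetical fitted values standing in for the estimates from the real data.
    print(estimated_coverage(a=4, n=3, var_op=2.0, var_res=1.0, alpha=0.05, n_sim=2500))
```

Because the seed fixes the simulated data sets, rerunning `estimated_coverage` with a smaller `alpha` reuses the same samples, so the estimated coverage can only stay the same or go up.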
So we wanted that 95% interval, but the Bootstrap |
procedure is telling us that the actual coverage is closer to |
88%. Now we can play a little game and we can repeat the |
Parametric Bootstrap using a 99% interval this time. So we go |
through that process, we redo all the bootstrap intervals and |
when we did the 99% interval we get an actual coverage of |
approximately 98%. Now suppose we played this game over and over |
again until we found an alpha with actual coverage approximately 95%. |
So in this case, suppose we did that and ended up finding that |
when we asked for a 97.6% interval, we actually got |
something like 95% coverage. |
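The "game" of trying nominal levels until the bootstrap coverage reaches 95% is a one-dimensional root search. The sketch below uses bisection and assumes the coverage estimate does not decrease as the nominal level increases, which holds when the same simulated data sets are reused for every level. The function names are hypothetical, and `toy_coverage` is only a stand-in for a real bootstrap run: it interpolates linearly through the illustrative numbers used in the talk (95% gives 88%, 97.6% gives 95%, 99% gives 98%).

```python
def calibrate_level(coverage_at, target=0.95, lo=0.90, hi=0.999, tol=1e-4):
    """Bisection for the nominal level whose estimated coverage hits the target.

    Assumes coverage_at(level) is non-decreasing in level.
    """
    if not (coverage_at(lo) <= target <= coverage_at(hi)):
        raise ValueError("target coverage is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if coverage_at(mid) < target:
            lo = mid   # interval too narrow: ask for a higher nominal level
        else:
            hi = mid   # coverage reached: try a lower nominal level
    return (lo + hi) / 2

# Illustrative (nominal level, bootstrap coverage) pairs from the talk's example.
POINTS = [(0.95, 0.88), (0.976, 0.95), (0.99, 0.98)]

def toy_coverage(level):
    """Stand-in for a bootstrap coverage estimate: linear interpolation of POINTS."""
    for (x0, y0), (x1, y1) in zip(POINTS, POINTS[1:]):
        if x0 <= level <= x1:
            return y0 + (y1 - y0) * (level - x0) / (x1 - x0)
    raise ValueError("level outside the tabulated range")

level = calibrate_level(toy_coverage, lo=0.95, hi=0.99)
print(round(level, 3))  # approximately 0.976
```

In a real run, `coverage_at` would recompute the intervals on the stored bootstrap samples at each candidate level rather than interpolate.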
Then what we can do is set 1 minus alpha to 0.976 using the Set Alpha Level |
option in the Fit Model launch dialog, and we will get an |
interval that has been Bootstrap Calibrated to have |
approximate coverage 95%. This is still an approximation. There |
is still a simulation component to it, as well as a deeper |
underlying approximation that is extraordinarily hard to analyze, |
but it can be made easy to use, and this is where Hadley comes |
in. Now I'm going to hand it over to him and he will demo the |
add-in and go over the simulations that he did that |
show that we are able to get better coverage rates than |
before by applying Bootstrap |
Calibration to Satterthwaite intervals on linear combinations |
of variance components. Take it away Hadley. |
Thank you very much, Chris, and hello to everyone watching online |
wherever you are. So I'm going to start out by showing you how the add-in |
works and how you can use it to calculate Bootstrap Calibrated |
confidence limits for random components in Mixed Models in |
JMP Pro 16. |
And from there we'll take a step back. We'll see how the add-in |
makes these calculations and I'll highlight some of the additions |
to Mixed Models in JMP Pro 16 that allow it to do that. |
From there, I'll show you the results of some simulation |
studies to give you an idea about how accurate this |
interval estimation method is, the Bootstrap Calibration |
method, and how it compares to some of the other methods for |
calculating confidence limits, as well as the situations where it's |
more or less accurate and some of the limitations and things you |
should be aware of if you're going to be applying it. |
We’ll discuss possibilities for improvements in future work |
just briefly, and from there, I'll conclude by showing you the |
new MSA Designer, Measurement Systems Analysis Designer, |
available from the DOE menu in JMP Pro 16 so that you can quickly |
and easily design and analyze your own MSA Gauge R&R |
studies. So let's start out by looking at this data set. |
This is one that I pulled from the sample data files. |
I'm going to run this Fit Mixed script here that I've saved. So |
what we've got here are our estimates |
for the random components. |
Now, |
it could be that you want to, for some reason, calculate an |
intermediate total, for example Operator and Part nested within |
Operator, or the three of these, |
you know, Operator and residual. |
So to calculate those is very simple, we simply add these |
estimates, but what's not so simple is to determine those |
confidence limits. |
There's a new feature in Mixed Models that's been added in 16: |
the linear combination of variance components feature |
right here, and so what you can do is you can click that. |
You can choose the combination of variance components that you're interested in, |
and you can press Done. So now we have an estimate for those |
components as well as their |
confidence limits. So, |
what I'm going to do now is I'm going to take this one step |
further and I'm going to calculate the Bootstrap |
Calibrated Satterthwaite estimates and I'm going to do that by |
going to my add-ins and clicking the Bootstrap Calibrated |
confidence intervals there. So from here we can set the |
number of simulations. |
2500 is the |
recommended number and the default. It's also the |
default number in some of the other simulation platforms |
in JMP Pro. I'm going to choose |
this one. But one thing to note is that it takes some |
time to be able to do this, and so in the interest of time |
what I'm going to do is I'm going to stop it early. |
And here we have our |
calibrated intervals, calibrated upper and lower confidence |
limits added to the report. |
So let's take a step back and see what happened there. |
I'm going to go ahead and add this again. |
Now, one thing that the add-in |
does, as soon as you run it, |
is it adds |
this simulation formula to the data table, so you can see the |
simulation formula here. |
When the add-in is closed, |
the simulation formula disappears. |
The simulation formula there takes advantage of another |
feature that's been added |
to the Mixed Models platform in JMP Pro, and that is the Save |
Simulation Formula feature |
here. So what this would allow you to do is to save the |
simulation formula and then to |
use that, for example, to simulate |
these values here. So, we can swap out our “Y” with our |
new simulation formula, |
and go ahead and run that. So when you run the add-in, this is |
all done in the background. But this is how the add-in goes |
about calculating these intervals. So I'm going to stop |
this early, once again in the interest of time. |
And now we see here the estimates |
for each simulated sample. The way the add-in works is that |
it takes all of these |
and calculates new estimates for the upper and |
lower Satterthwaite limits from this |
estimate and this standard error, swapping out different |
values for alpha. So what we're aiming for is 0.05, right? |
So that we get 95% upper and lower limits, and what it |
does is it finds |
an alpha value that results in 95% coverage, that is 95% hits |
and 5% misses, and |
swaps that in. That's how you get your calibrated intervals. |
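The recalculation step can be sketched as follows: given the stored (estimate, standard error) pair from each simulated fit, recompute the Satterthwaite limits for a candidate alpha, count hits against the generating value, and keep the alpha that gives 95% hits. This is a sketch under assumptions, not the add-in's code: it assumes the degrees of freedom are taken as 2 × (estimate / standard error)², uses the Wilson-Hilferty approximation for chi-square quantiles to stay within the standard library, and scans a fixed grid of alpha values. All names and the toy data are hypothetical.

```python
import random
from statistics import NormalDist, variance

_STD_NORMAL = NormalDist()

def chi2_quantile(p, df):
    """Wilson-Hilferty approximation to the chi-square quantile (an assumption)."""
    z = _STD_NORMAL.inv_cdf(p)
    c = 2.0 / (9.0 * df)
    return df * max(1.0 - c + z * c ** 0.5, 1e-12) ** 3

def satterthwaite_limits(estimate, std_error, alpha):
    """Limits from an estimate and its standard error, assuming df = 2 * (est / SE)^2."""
    df = 2.0 * (estimate / std_error) ** 2
    return (df * estimate / chi2_quantile(1 - alpha / 2, df),
            df * estimate / chi2_quantile(alpha / 2, df))

def hit_rate(pairs, truth, alpha):
    """Share of simulated intervals that contain the generating value."""
    hits = 0
    for estimate, std_error in pairs:
        lo, hi = satterthwaite_limits(estimate, std_error, alpha)
        hits += lo <= truth <= hi
    return hits / len(pairs)

def calibrated_alpha(pairs, truth, target=0.95, grid=None):
    """Largest alpha on the grid whose hit rate still reaches the target."""
    grid = grid or [k / 1000 for k in range(1, 101)]  # 0.001 .. 0.100
    reaching = [a for a in grid if hit_rate(pairs, truth, a) >= target]
    return max(reaching) if reaching else min(grid)

if __name__ == "__main__":
    # Toy stand-in for bootstrap output: sample variances of 6 normal draws,
    # with SE = s^2 * sqrt(2 / (n - 1)); the generating variance is 4.0.
    rng = random.Random(0)
    truth, n = 4.0, 6
    pairs = []
    for _ in range(1000):
        s2 = variance([rng.gauss(0.0, truth ** 0.5) for _ in range(n)])
        pairs.append((s2, s2 * (2.0 / (n - 1)) ** 0.5))
    print(calibrated_alpha(pairs, truth))
```

In this toy setup the uncalibrated interval is already close to exact, so the calibrated alpha should land near 0.05; with the small nested designs discussed in the talk it would typically come out smaller.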
So I hope you enjoyed seeing that. I hope you find it useful. |
We've done some simulation studies and what we found out |
is that |
with the intervals, which you can see here for four Operators |
and 12 Days as our random components, |
we've achieved misses of about |
7%, so a 92.8% hit rate. Now this is better than all of the |
others, including the linear combination, which is |
simply the standard Satterthwaite interval |
calculated on the linear combination of variance components, as well as the |
Bootstrap quantiles, the bias-corrected intervals, and the |
bias-corrected and accelerated intervals. But as you'll see, |
these intervals improve, all of them, |
as you increase your number of Operators from |
4 to 8 and the number of Days from 12 to 24. So increasing the |
levels of these random components |
results in much better, much more |
accurate estimates for the confidence limits, and so much so |
that we now have |
a method here |
that is equivalent, |
just, to an |
alpha value of .05. |
So this improvement in performance, of course, comes at |
a cost, and one of those costs is the length of the intervals. |
And so you can see here |
that with our |
Bootstrap Calibrated, well with all of our intervals in fact, |
when we have an increasing number of Operators, |
the length of the interval is much more |
bundled closer to 0 than it is when you've got a smaller |
number of Operators. You can see that this tails out much |
further, so that's this blue area here. That's true for |
all of them, but it's especially true for the |
Bootstrap Calibrated interval. You can see this |
long tail here. |
On average, you're going to get longer lengths using this |
method, but you have a more accurate method. |
Exploring that a little bit deeper, you can see here |
that this increase in length is true for four Operators, as |
well as eight Operators, and it is significant. |
Statistically significant. |
The other thing that I looked at, |
is the effect of adding repetitions, so the difference |
between two repetitions and five repetitions, and what you'll see |
here is that there really is no |
difference. So looking across |
the different sets of combinations from |
four Operators and two reps to four Operators and five reps, about 3 |
measurements total versus 6 measurements, we really don't |
gain anything. All of these are equivalent to each other. |
So that's something to be aware of, that you see improvements in |
accuracy when increasing the number of Operators, and you |
don't see improvements when increasing the number of |
repetitions. |
One thing that I'd like to mention as a |
possibility to improve upon these results is the |
Fractional Random Weight Bootstrap, which we would have |
liked to have been able to implement in time for |
this conference. We weren't able to do that, to take this |
and to apply it to random variance components, and so we |
hope to be able to do that in future work and perhaps even |
see an improvement upon the Bootstrap Calibrated interval. |
And then the other thing that I'd like to highlight before I |
go is the new |
MSA Designer that's been added to JMP 16, and so |
from here what we can do is we can very quickly |
create our own design in order to be able to |
perform our own MSA or |
Gauge R&R analysis. And so let's see, I'll do this with three |
Operators and five Parts. |
I'll label these |
A, B and C. |
And we'll do one repetition of each. So that's two |
measurements total. |
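The size of the resulting design table can be checked with a short sketch. It assumes a fully crossed Operator by Part layout with a randomized run order and reads "one repetition" as two measurements per cell, as stated above; the table the MSA Designer actually produces may be laid out differently. The function and column names are hypothetical.

```python
import itertools
import random

def msa_design(operators, parts, repetitions, seed=None):
    """Crossed Operator x Part layout with (repetitions + 1) measurements per cell."""
    rows = [
        {"Operator": op, "Part": part, "Measurement": m}
        for op, part, m in itertools.product(operators, parts, range(1, repetitions + 2))
    ]
    random.Random(seed).shuffle(rows)          # randomize the run order
    for run, row in enumerate(rows, start=1):
        row["Run"] = run
    return rows

design = msa_design(["A", "B", "C"], [1, 2, 3, 4, 5], repetitions=1, seed=16)
print(len(design))  # 3 operators x 5 parts x 2 measurements = 30 rows
```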
So here we've got a table with our |
design. What I can do is I can press this button to very |
quickly send that to the different operators, have |
them fill out their parts, send that back to me. |
And then I can add |
those results together. |
So I'll just sort this because I've got another table over here |
where I've done this ahead of time. So I'll just add these |
values over there. And now from the scripts within the table we |
can quickly and easily do our own Measurement Systems Analysis |
and Gauge R&R. So I hope you found this useful. I hope you |
continue to enjoy the talks at this conference. Thank you very |
much for listening. |