Hadley Myers, JMP Systems Engineer, SAS
Chris Gotwalt, JMP Director of Statistical Research and Development, SAS
The need to determine confidence intervals for linear combinations of random mixed-model variance components, especially critical in Pharmaceutical and Life Science applications, was addressed with the creation of a JMP Add-In, demonstrated at the JMP Discovery Summit Europe 2020 and available at the JMP User Community. The add-in used parametric bootstrapping of the sample variance components to generate a table of simulated values and calculated “bias-corrected” (BC) percentile intervals on those values. BC percentile intervals account for asymmetry in simulated distributions better than standard percentile intervals do, and a simulation study using a sample data set at the time showed closer-to-true α-values with the BC intervals. This work reports on the release of Version 2 of the Add-In, which calculates both sets of confidence intervals (standard and BC percentiles), as well as a third set, the “bias-corrected and accelerated” (BCa) confidence interval, which has the advantage of adjusting for underlying higher-order effects. Users will therefore have the flexibility to decide for themselves which method is appropriate for their data. The new version of the Add-In will be demonstrated, and an overview of the advantages and disadvantages of each method will be presented.
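The three interval types named in the abstract can be illustrated with a minimal Python sketch. It is not the add-in's code: the function names, the interpolation rule for quantiles, and the skewed demo data are assumptions made for illustration. The BC limits use the standard bias-correction constant z0 (the normal quantile of the share of bootstrap values below the original estimate), and the BCa form additionally takes an acceleration constant that is simply passed in here rather than estimated.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for quantiles and probabilities


def quantile(sorted_values, p):
    """Linear-interpolation quantile of an already sorted list, for 0 <= p <= 1."""
    position = p * (len(sorted_values) - 1)
    low = int(position)
    high = min(low + 1, len(sorted_values) - 1)
    weight = position - low
    return sorted_values[low] * (1 - weight) + sorted_values[high] * weight


def percentile_interval(boot, alpha=0.05):
    """Standard percentile interval: the alpha/2 and 1 - alpha/2 bootstrap quantiles."""
    ordered = sorted(boot)
    return quantile(ordered, alpha / 2), quantile(ordered, 1 - alpha / 2)


def bc_interval(boot, estimate, alpha=0.05, acceleration=0.0):
    """Bias-corrected interval; a nonzero acceleration gives the BCa form."""
    ordered = sorted(boot)
    below = sum(1 for value in ordered if value < estimate)
    # Keep the proportion strictly inside (0, 1) so the normal quantile exists.
    half_step = 0.5 / len(ordered)
    proportion = min(max(below / len(ordered), half_step), 1 - half_step)
    z0 = STD_NORMAL.inv_cdf(proportion)

    def adjusted_level(p):
        z = STD_NORMAL.inv_cdf(p)
        return STD_NORMAL.cdf(z0 + (z0 + z) / (1 - acceleration * (z0 + z)))

    return (quantile(ordered, adjusted_level(alpha / 2)),
            quantile(ordered, adjusted_level(1 - alpha / 2)))


# Hypothetical demo: right-skewed draws that mimic a bootstrapped variance estimate.
rng = random.Random(2020)
estimate = 4.0
draws = [estimate * rng.gammavariate(4.0, 2.0) / 8.0 for _ in range(2500)]
print("percentile:", percentile_interval(draws))
print("BC:        ", bc_interval(draws, estimate))
print("BCa (a=.05):", bc_interval(draws, estimate, acceleration=0.05))
```

When exactly half of the simulated values fall below the original estimate, z0 is zero and the BC limits coincide with the percentile limits. The two methods differ only when the simulated distribution is off-center relative to the estimate.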
| Speaker | Transcript |
| Hello, my name is Chris Gotwalt | |
| 00 | 08.966 |
| 3 | |
| has been developed for variance | |
| components models, we think | |
| 00 | 25.566 |
| 7 | |
| statistical process control | |
| program, one has to understand | |
| 00 | 40.466 |
| 11 | |
| ascertain how much measurement | |
| error is attributable to testing | |
| 00 | 55.500 |
| 15 | |
| there might be five or 10 units | |
| or parts tested per operator, | |
| 00 | 10.766 |
| 19 | |
| different measuring tools is | |
| small enough that differences in | |
| 00 | 26.033 |
| 23 | |
| measurement to measurement, | |
| repeatability variation, or | |
| 00 | 39.900 |
| 27 | |
| measurement systems analyses, as | |
| well as a confidence interval on | |
| 00 | 52.766 |
| 31 | |
| interval estimates in the report | |
| and obtain a valid 95% interval | |
| 00 | 07.033 |
| 35 | |
| calculate confidence intervals, | |
| because we believed it would be | |
| 00 | 23.400 |
| 39 | |
| and the sum of the variance | |
| components. Unfortunately, the | |
| 00 | 38.266 |
| 43 | |
| R&R study. So because variance | |
| components explicitly violate | |
| 00 | 57.300 |
| 47 | |
| you were to use the one-click | |
| bootstrap on variance components | |
| 00 | 10.566 |
| 51 | |
| less. So when we were designing | |
| Fit Mixed, and the REML | |
| 00 | 27.933 |
| 55 | |
| independent. So back to the | |
| drawing board. So it turns out | |
| 00 | 44.666 |
| 59 | |
| in JMP. One approach is called | |
| the parametric bootstrap that | |
| 00 | 01.333 |
| 63 | |
| comparison of the two kind of | |
| families of bootstrap. So the | |
| 00 | 18.333 |
| 67 | |
| they're not assuming | |
| any underlying model. And it's | |
| 00 | 37.200 |
| 71 | |
| the rows in the data table are | |
| independent from one another. | |
| 00 | 52.766 |
| 75 | |
| values, it has the advantage | |
| that we don't have to make this | |
| 00 | 09.866 |
| 79 | |
| bootstrap simulation. The | |
| downside to this is that you | |
| 00 | 25.133 |
| 83 | |
| do a quick introduction to what | |
| the bootstrap...the parametric | |
| 00 | 41.966 |
| 87 | |
| to identify or wanted to | |
| estimate the crossing time of a | |
| 00 | 04.733 |
| 91 | |
| 162.8. Now, we want to use a | |
| parametric bootstrap to go | |
| 00 | 22.466 |
| 95 | |
| has the ability to save the | |
| simulation formula back to the | |
| 00 | 35.933 |
| 99 | |
| that uses the estimates in the | |
| report as inputs into a random | |
| 00 | 53.300 |
| 00.666 | |
| 104 | |
| And we take our estimates and | |
| pull them out into a separate | |
| 00 | 17.666 |
| 108 | |
| And then what we have can be | |
| seen as a random sample from the | |
| 00 | 37.000 |
| 112 | |
| formula column for the crossing | |
| time. And that is automatically | |
| 00 | 53.900 |
| 116 | |
| those...on the crossing time, or | |
| any quantity of interest. When | |
| 00 | 15.366 |
| 120 | |
| simulation, create a formula | |
| column of whatever function of | |
| 00 | 28.366 |
| 124 | |
| derived quantity of interest and | |
| obtain confidence intervals | |
| 00 | 47.233 |
| 128 | |
| the add-in so that you're able | |
| to do this quite easily for | |
| 00 | 05.033 |
| 132 | |
| 133 | |
| we'll start by showing you how | |
| to run the add-in yourself once | |
| 00 | 25.500 |
| 137 | |
| first version was presented at | |
| the JMP 2020 Discovery Summit | |
| 00 | 42.566 |
| 141 | |
| overview, but we'll show you the | |
| references where you can dive in | |
| 00 | 58.866 |
| 145 | |
| perfectly fine as well. So I'm | |
| going to go ahead and start with | |
| 00 | 14.700 |
| 149 | |
| makes use of the Fit Mixed | |
| platform, right, created from | |
| 00 | 31.333 |
| 153 | |
| the add-in will only work with | |
| JMP Pro. So someone might, | |
| 00 | 49.066 |
| 157 | |
| want some measure like | |
| reproducibility. So that would | |
| 00 | 10.166 |
| 161 | |
| as we said, to calculate the | |
| estimate for these, there's no | |
| 00 | 26.066 |
| 165 | |
| columns here. The reality is | |
| much, much, much more | |
| 00 | 43.066 |
| 169 | |
| of the estimate without | |
| considering the worst case | |
| 00 | 59.233 |
| 173 | |
| production that the actual | |
| variance is higher than they have | |
| 00 | 19.733 |
| 177 | |
| don't risk being out of spec in | |
| production. So to run the add-in | |
| 00 | 35.700 |
| 181 | |
| From here, I can select the | |
| linear combination of confidence | |
| 00 | 55.266 |
| 185 | |
| simulations, you get a better | |
| estimate of the confidence | |
| 00 | 10.500 |
| 189 | |
| 2500. I'm going to leave it as | |
| 1000 here just for demonstration | |
| 00 | 28.733 |
| 193 | |
| operator or the batch variable, | |
| and then press perform analysis. | |
| 00 | 45.533 |
| 197 | |
| purpose of this demonstration, I | |
| think I will stop it early. | |
| 00 | 07.733 |
| 201 | |
| calculated confidence limits, the | |
| bootstrap quantiles, which are | |
| 00 | 28.933 |
| 205 | |
| these two tabs. But if you'd | |
| like to see how those compare, | |
| 00 | 42.366 |
| 209 | |
| so what does enough mean, enough | |
| for your confidence limits to | |
| 00 | 57.400 |
| 213 | |
| stopped it before a thousand. So | |
| that's how the add-in works. And | |
| 00 | 15.466 |
| 217 | |
| distributed around the original | |
| estimate, they are in fact | |
| 00 | 37.366 |
| 221 | |
| relaunch this analysis. So | |
| you'll see that when the | |
| 00 | 56.433 |
| 225 | |
| European Discovery, required | |
| bounded variance confidence | |
| 00 | 16.766 |
| 229 | |
| that, if that happens for some | |
| of the bootstrap samples or for | |
| 00 | 40.466 |
| 233 | |
| early, again, I'll just let it | |
| run a little bit. Yeah, so I, as | |
| 00 | 00.966 |
| 237 | |
| the samples are allowed, in some | |
| cases, to be below zero. So in | |
| 00 | 28.400 |
| 242 | |
| simulation column here, this | |
| column of simulated | |
| 00 | 49.100 |
| 246 | |
| see them both at the same time. | |
| It's a bit... it's a bit tricky, | |
| 00 | 16.300 |
| 252 | |
| right components, it's a good | |
| idea to run the add-in directly | |
| 00 | 31.766 |
| 256 | |
| that column is then deleted. So | |
| one thing to mention, before | |
| 00 | 50.500 |
| 260 | |
| accounts for the skewness of the | |
| bootstrap distributions, right, | |
| 00 | 14.200 |
| 264 | |
| that. And then the accelerated | |
| takes that even further. So here | |
| 00 | 27.200 |
| 268 | |
| thing to mention is that the | |
| alpha in this represents the | |
| 00 | 43.700 |
| 272 | |
| value for which it's been | |
| calculated? And what can we do to | |
| 00 | 03.233 |
| 276 | |
| up to investigate the four | |
| different kinds of the variance | |
| 00 | 24.700 |
| 280 | |
| method, the bias-corrected | |
| method and the BCa. We also | |
| 00 | 43.000 |
| 284 | |
| study. So for all 16 | |
| combinations of these three | |
| 00 | 01.566 |
| 288 | |
| combinations of confidence | |
| intervals, and kept track of how | |
| 00 | 20.400 |
| 292 | |
| 293 | |
| coverage as we're varying these | |
| three variables, and we see here | |
| 00 | 45.300 |
| 297 | |
| 298 | |
| techniques. And the second best | |
| is the bias-corrected and | |
| 00 | 09.966 |
| 302 | |
| the best one. Now, if you turn | |
| no bounds on, which means that | |
| 00 | 28.433 |
| 306 | |
| variance components with a | |
| pretty close to 95% coverage. | |
| 00 | 48.200 |
| 310 | |
| intervals are performing | |
| similarly at about 93%. But | |
| 00 | 02.200 |
| 07.800 | |
| 315 | |
| to what a master's thesis paper's | |
| research would have, would | |
| 00 | 27.966 |
| 319 | |
| potentially more work to be | |
| done. There's other interval | |
| 00 | 42.566 |
| 323 | |
| things like generalized | |
| confidence intervals. General | |
| 00 | 59.466 |
| 327 | |
| intervals might also do the | |
| trick for us as well. Hadley's | |
| 00 | 19.966 |
| 331 | |
| so that you can now do | |
| parametric bootstrap simulations | |
| 00 | 37.566 |
| 335 | |
| 16. When you bring that up, you | |
| can enter the linear combination | |
| 00 | 51.766 |
| 339 |
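
The parametric bootstrap of a linear combination of variance components that the talk describes can be sketched in Python under some loud assumptions. The add-in relies on REML estimates from JMP Pro's Fit Mixed platform. This sketch instead uses balanced one-way ANOVA (method-of-moments) estimators as a stand-in, and every function name, the data layout, and the demo values are hypothetical. The `bounded` flag mirrors the bounded versus unbounded choice discussed in the talk: with `bounded=False` the between-group variance estimate is allowed to fall below zero.

```python
import random
from statistics import mean


def fit_one_way(groups, bounded=True):
    """ANOVA estimates for y_ij = mu + a_i + e_ij with equal group sizes.

    Returns (grand mean, between-group variance, within-group variance).
    """
    k, n = len(groups), len(groups[0])
    group_means = [mean(g) for g in groups]
    grand_mean = mean(group_means)
    ms_within = sum((y - m) ** 2
                    for g, m in zip(groups, group_means) for y in g) / (k * (n - 1))
    ms_between = n * sum((m - grand_mean) ** 2 for m in group_means) / (k - 1)
    var_between = (ms_between - ms_within) / n  # can be negative when unbounded
    if bounded:
        var_between = max(var_between, 0.0)
    return grand_mean, var_between, ms_within


def simulate_one_way(mu, var_between, var_within, k, n, rng):
    """Draw one data set from the fitted model (negative variances are clipped to 0)."""
    sd_between = max(var_between, 0.0) ** 0.5
    sd_within = var_within ** 0.5
    groups = []
    for _ in range(k):
        effect = rng.gauss(0.0, sd_between)
        groups.append([mu + effect + rng.gauss(0.0, sd_within) for _ in range(n)])
    return groups


def bootstrap_linear_combination(groups, weights, n_boot=1000, bounded=True, seed=1):
    """Return the estimate of w0*var_between + w1*var_within and its bootstrap draws."""
    rng = random.Random(seed)
    k, n = len(groups), len(groups[0])
    mu, var_b, var_w = fit_one_way(groups, bounded)
    estimate = weights[0] * var_b + weights[1] * var_w
    draws = []
    for _ in range(n_boot):
        simulated = simulate_one_way(mu, var_b, var_w, k, n, rng)  # simulate from the fit
        _, sim_b, sim_w = fit_one_way(simulated, bounded)          # refit the same model
        draws.append(weights[0] * sim_b + weights[1] * sim_w)
    return estimate, draws


# Hypothetical demo: 5 operators measuring 10 parts each, total variance of interest.
data = simulate_one_way(100.0, 4.0, 1.0, k=5, n=10, rng=random.Random(7))
estimate, draws = bootstrap_linear_combination(data, weights=(1.0, 1.0))
ordered = sorted(draws)
print("estimate:", estimate)
print("rough 95% percentile limits:", ordered[24], ordered[974])
```

The draws returned here play the role of the table of simulated values on which the percentile, BC, and BCa limits are then computed.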