JMP User Community

Sample size calculation for TOST

Mar 23, 2020 8:42 AM
Hi everyone!

__Context:__ I have a process A, which is my reference, and I developed a process B whose comparability to process A I want to assess. To do so, I plan to use a TOST (equivalence test) with 3x the standard deviation (STD) of process A as the threshold. Now I want to know how many times I should repeat process B to be able to detect such a difference (knowing that each run costs a significant amount of money).

__Option 1__: Use the Sample Size & Power tool with a two-sample means comparison, an alpha of 0.1 (2 x 0.05, one for each process), STD = 1, no extra parameters, Difference to detect = 3 (3 x STD of 1), and Power = 0.95. I obtain 7, so 4 runs of process B to compare with 4 runs of process A to ensure comparability.

__Option 2__: Use the t-distribution (as shown below), which suggests 6 runs:

What do you think about the two options, which one is best for this case? Do you have any other tool I didn't think of?

Thanks a lot !

@martindemel maybe ?


Re: Sample size calculation for TOST

Hello Elofar,

I found an article indicating that you can approximate the power calculation with a t-test where alpha_ttest = 1 - power_equiv, power_ttest = 1 - alpha_equiv, and the difference to detect equals the threshold value for the equivalence test. In the Power and Sample Size calculator for two sample means, you would specify alpha = 0.05 (1 - 0.95), StdDev = 1, Difference to detect = 3, and Power = 0.9 (1 - 0.1). That gives a total sample size of 8, which appears to agree with your Option 1.

Hope that helps!

Re: Sample size calculation for TOST

We developed an app that does TOST sample size calculations. I can't send that out, but I can share the script I used to replicate the power calculations for TOST in SAS using Owen's Q. It has the code for the 1-sample and 2-sample scenarios, with an example for each. Note that it does not handle small signal-to-noise scenarios well. In my experience so far, it only works until nu (the sample size parameter) reaches about 200-250, because Gamma( nu/2 ) overflows numerically and returns a missing value.

```
//Owen's Q: c * integral over x from a to b of
//  Phi( t*x/Sqrt(nu) - delta ) * x^(nu-1) * phi( x )
OwensQ = Function( {t, delta, a, b, nu},
	//Normalizing constant (Gamma( nu/2 ) overflows for large nu)
	c = Sqrt( 2 * Pi() ) / (Gamma( nu / 2 ) * Power( 2, (nu - 2) / 2 ));
	integrand = Expr(
		Normal Distribution( ((t * x / Sqrt( nu )) - delta) ) * Power( x, nu - 1 ) * Normal Density( x )
	);
	c * Integrate( integrand, x, a, b, <<Starting Value( Mean( a, b ) ) );
);
/******** 1-Sample TOST Power ****************/
One_TOST_power = expr(
	nu = n - 1;
	t1 = -t Quantile( conf, nu );
	t2 = t Quantile( conf, nu );
	//Non-centrality parameters:
	delta1 = (mu - muu) / (sigma / Sqrt( n ));
	delta2 = (mu - mul) / (sigma / Sqrt( n ));
	//Upper integration limit:
	b = (Sqrt( nu ) * (muu - mul)) / ((2 * sigma / Sqrt( n )) * t Quantile( conf, nu ));
	//Power:
	OwensQ( t1, delta1, 0, b, nu ) - OwensQ( t2, delta2, 0, b, nu );
);
//1-sample case:
n = 15;
conf = 0.95;
mu = 505;
muu = 510;
mul = 490;
sigma = 4;
One_TOST_power(); //should be 0.9983947 (second part is effectively 0)
/******** 2-Sample TOST Power ****************/
Two_TOST_power = expr(
	nu = N - 2;
	w1 = 1/(W+1);
	w2 = W/(W+1);
	t1 = -t Quantile( conf, nu );
	t2 = t Quantile( conf, nu );
	//For Unequal Sample Sizes
	//Non-centrality parameters:
	delta1 = (mudiff - mudiff_u) / (sigma / (Sqrt( N ) * Sqrt( w1 * w2 )));
	delta2 = (mudiff - mudiff_l) / (sigma / (Sqrt( N ) * Sqrt( w1 * w2 )));
	//Upper integration limit:
	b = (Sqrt( nu ) * (mudiff_u - mudiff_l)) / ((2 * sigma / (Sqrt( N ) * Sqrt( w1 * w2 ))) * t Quantile( conf, nu ));
	Power = OwensQ( t1, delta1, 0, b, nu ) - OwensQ( t2, delta2, 0, b, nu );
);
//2-sample equal sizes:
N = 36;
W = 1;
conf = 0.95;
mudiff_l = -2;
mudiff_u = 2;
mudiff = 0;
sigma = 2;
Two_TOST_power(); //Should be just over 0.8
//2-sample unequal sizes:
N = 42;
W = 2; //unequal sample sizes:
conf = 0.95;
mudiff_l = -2;
mudiff_u = 2;
mudiff = 0;
sigma = 2;
Two_TOST_power(); //Should be just over 0.8
```

-- Cameron Willden
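
For sample sizes where the Gamma( nu/2 ) overflow mentioned above becomes a problem, one possible cross-check is the noncentral t approximation to TOST power. This is not the Owen's Q calculation used in the script. It is a different, approximate technique that replaces each Owen's Q term with a noncentral t probability, so its results may differ slightly from the script's, especially when power is low. The sketch below assumes that the optional third argument of `t Distribution` is the noncentrality parameter. The names `se`, `tcrit`, and `approx_power` are introduced here for illustration only.

```
//Same two-sample inputs as the equal-sizes example above
N = 36;
W = 1;
conf = 0.95;
mudiff_l = -2;
mudiff_u = 2;
mudiff = 0;
sigma = 2;

nu = N - 2;
w1 = 1 / (W + 1);
w2 = W / (W + 1);
//Standard error of the mean difference (same expression as in the script)
se = sigma / (Sqrt( N ) * Sqrt( w1 * w2 ));
tcrit = t Quantile( conf, nu );
//Non-centrality parameters, named as in the script
delta1 = (mudiff - mudiff_u) / se;
delta2 = (mudiff - mudiff_l) / se;
//Approximate power: difference of two noncentral t probabilities
//(assumption: third argument of t Distribution is the noncentrality)
approx_power = t Distribution( -tcrit, nu, delta1 ) - t Distribution( tcrit, nu, delta2 );
Show( approx_power );
```

Because this version never calls Gamma( nu/2 ) directly, it may remain usable at larger N, but it should be treated as a sanity check rather than a replacement for the Owen's Q result.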

Re: Sample size calculation for TOST

Thank you for that answer. I'm not sure I understood the whole script, though: what am I supposed to change in it?

My concern is that this experiment is "official" (it will be submitted to the authorities), so I am not sure about using such a script...

Re: Sample size calculation for TOST

For the one-sample case, you specify the values of these variables:

```
n = 15; //total sample size
conf = 0.95; //confidence
mu = 505; //hypothesized mean
muu = 510; //upper equivalence limit
mul = 490; //lower equivalence limit
sigma = 4; //estimated standard deviation
```

For the two-sample case, you specify the values of these variables:

```
N = 36; //total sample size
W = 1; //weighting for sample size (e.g. use 2 for one group to have 2x as many samples as the other)
conf = 0.95; //confidence level
mudiff_l = -2; //lower equivalence limit on mean difference
mudiff_u = 2; //upper equivalence limit on mean difference
mudiff = 0; //estimated mean difference
sigma = 2; //estimated standard deviation
```

To find a sample size, you would just need to implement a loop to increase N until the desired power is achieved.
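
A minimal sketch of such a loop, assuming `OwensQ` and `Two_TOST_power` from the script above have already been run in the same JMP session. The names `target_power`, `N_max`, and `pw` are introduced here for illustration, the target of 0.8 is only a placeholder, and the cap of 250 is an assumption based on the Gamma( nu/2 ) overflow noted earlier:

```
//Two-sample inputs (same example values as above)
W = 1;
conf = 0.95;
mudiff_l = -2;
mudiff_u = 2;
mudiff = 0;
sigma = 2;

target_power = 0.8; //hypothetical target; set to your own requirement
N_max = 250;        //assumed cap, below the reported overflow range

N = 4;              //smallest total size tried (2 per group when W = 1)
pw = Two_TOST_power();
//Increase N until the target is reached or the cap is hit
While( (Is Missing( pw ) | pw < target_power) & N < N_max,
	N += 2;         //step by 2 to keep the group sizes equal when W = 1
	pw = Two_TOST_power();
);
Show( N, pw );
```

If the loop ends with `N` equal to `N_max` and `pw` still missing or below the target, the script's numerical limits were reached before the desired power was found. For unequal weights (W other than 1), the step size would need to be chosen so that both group sizes stay whole numbers.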

I understand your concern about needing to submit this to an authority for review. I work in a regulated business myself, and we have a validation protocol for the JMP add-in with the calculators we built based upon this script. The SAS documentation for PROC POWER shows the formulas for the power calculation for 1-sample and 2-sample TOST. You could reference that document to show the calculations are accurate, reproduce it in other software (e.g. R; I do have a script for that if you would like), or confirm the power estimates through simulation. If you have SAS anywhere in your organization, you can just get a PROC POWER printout and be done.
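
The simulation check could look roughly like the sketch below. It is not part of the original script: it draws normally distributed data for two groups, applies a pooled-variance two-sample TOST to each simulated data set, and reports the fraction of runs that conclude equivalence. The names `n1`, `n2`, `nsim`, `hits`, `tcrit`, `d`, `sp2`, and `se` are introduced here for illustration, and 10000 runs is an arbitrary choice.

```
//Settings matching the equal-sizes two-sample example (N = 36, W = 1)
n1 = 18;
n2 = 18;
conf = 0.95;
mudiff_l = -2;
mudiff_u = 2;
mudiff = 0;   //assumed true mean difference
sigma = 2;    //assumed common standard deviation
nsim = 10000; //number of simulated experiments (arbitrary)

nu = n1 + n2 - 2;
tcrit = t Quantile( conf, nu );
hits = 0;
For( i = 1, i <= nsim, i++,
	//J() evaluates Random Normal() separately for each element
	x1 = J( n1, 1, Random Normal( mudiff, sigma ) );
	x2 = J( n2, 1, Random Normal( 0, sigma ) );
	d = Mean( x1 ) - Mean( x2 );
	//Pooled variance and standard error of the difference
	sp2 = ((n1 - 1) * Std Dev( x1 ) ^ 2 + (n2 - 1) * Std Dev( x2 ) ^ 2) / nu;
	se = Sqrt( sp2 * (1 / n1 + 1 / n2) );
	//Equivalence is concluded only if both one-sided tests reject
	If( (d - mudiff_l) / se > tcrit & (d - mudiff_u) / se < -tcrit,
		hits++
	);
);
Show( hits / nsim ); //simulated power estimate
```

The estimate varies from run to run, so it should only be expected to agree with the calculated power to within simulation error.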

-- Cameron Willden

Re: Sample size calculation for TOST

Alright great thanks a lot !

I don't have R so I'll try this with the script !

Re: Sample size calculation for TOST

That sounds good, thank you! Couple of questions:

1. About the alpha: we were told that since we have 2 populations, we have to apply the alpha of 1 - 0.95 to both, and therefore use 0.05 x 2 = 0.1 as the global alpha in the tool. What do you think about that?

2. About the power: is it usual to use 0.9?

Thanks a lot for your help!