Justin_Bui
Level III

Newbie to statistic: Is that a P-hacking if I do equivalence test several time?

Hi all, 


I'm new to statistics and really need your help here.


My question is: is it OK to run an equivalence test several times until I find an equivalence limit for which both p-values are < 5%?

 

I have data for 2 groups, shown below, and I want to test whether they are equal or different.

Justin_Bui_0-1667441965228.png

As a first step I ran a t-test and got a p-value > 5%, meaning I failed to reject H0 (Test = Ref). That does not help much in deciding my next actions.

 

So I ran an equivalence test with an equivalence limit of +/-0.1 as the practical range.

But again the result is not really good. All I can conclude is that my Test group is 0.1 unit higher or lower than Ref.

That is still not a useful enough result for my next actions.

Justin_Bui_1-1667442200244.png

 

So I increased the limit several times until I reached an equivalence limit of 0.16 unit, where both p-values are < 5%.

I interpret this result as: the 2 groups are equal within a 0.16 unit range.

Meaning that my Test group can be up to 0.16 unit higher or lower than Ref (or could be equal too).

Justin_Bui_0-1667443880792.png

 

So I can evaluate whether being 0.16 unit higher or lower than the current value is OK for my next action.


Is my conclusion correct? Am I allowed to do this, or am I p-hacking?

Just a newbie in statistics who needs your help.
Thanks, all.

 

 

1 ACCEPTED SOLUTION

Re: Newbie to statistic: Is that a P-hacking if I do equivalence test several time?

Adding to @Phil_Kay, you should have a clear, well-defined question or purpose for the test you are about to perform. Your example is a case of p-hacking. Only one of these tests is relevant to your purpose. You should not keep applying tests until you find a favorable result. One test suffices, and you accept the answer either way.

 

Let's say that you made a change hoping for an improvement. Then you want to use a test for a significant difference. (The null hypothesis, or 'Devil's advocate', is that there is no change or improvement.) Your first comparison is an example of this kind of test.

 

Instead, perhaps you made a change but hope that it makes no difference, like swapping a new piece of equipment for an old one and hoping to continue to see the same outcome. Then you want to use a test for significant or practical equivalence. Your second pair of comparisons is an example of this kind of test.

 

The risk level is usually established beforehand. Adjusting the practical difference or the level of significance while performing tests is also p-hacking of sorts.
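
To see why adjusting the practical difference after looking at the data is not really a test, here is a minimal sketch. It assumes a large-sample normal approximation instead of the t-distribution that JMP would use, and the data and the function name `smallest_passing_margin` are hypothetical, made up for illustration. Under those assumptions, the smallest limit at which both one-sided p-values drop below alpha is just |difference| + z * SE, so widening the limit until the test "passes" only reads a number off the confidence interval.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

# Hypothetical measurements, made up for illustration only.
test_vals = [10.0, 10.1, 9.9, 10.05, 9.95, 10.0, 10.1, 9.9]
ref_vals = [10.02, 9.98, 10.08, 9.92, 10.0, 10.05, 9.95, 10.0]


def smallest_passing_margin(test, ref, alpha=0.05):
    """Smallest symmetric limit at which a normal-approximation
    equivalence test (two one-sided tests) would declare equivalence."""
    diff = mean(test) - mean(ref)
    # Standard error of the difference, unequal variances allowed.
    se = sqrt(variance(test) / len(test) + variance(ref) / len(ref))
    z_crit = NormalDist().inv_cdf(1 - alpha)
    # (1 - 2*alpha) confidence interval for the difference.
    lo, hi = diff - z_crit * se, diff + z_crit * se
    # Equivalence at limit m requires -m < lo and hi < m,
    # so any m above max(-lo, hi) = |diff| + z_crit * se "passes".
    return max(-lo, hi)


margin = smallest_passing_margin(test_vals, ref_vals)
print(f"Any limit above {margin:.4f} would pass for this data set")
```

The returned value depends entirely on the sample, which is the point: a limit found this way describes the data rather than a requirement fixed beforehand.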


3 REPLIES 3
Phil_Kay
Staff

Re: Newbie to statistic: Is that a P-hacking if I do equivalence test several time?

Hi @Justin_Bui,

It's a good question. But p-hacking aside, I think that you are misinterpreting the equivalence test. These things are confusing, I know!

The low p-value in your last case is telling you that the 2 groups ARE practically equivalent. The confidence bounds around the estimated difference are within the practical range of +/- 0.16.
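
As a rough illustration of how the two p-values and the confidence bounds relate, here is a minimal sketch of the two one-sided tests behind an equivalence test. It is not JMP's implementation: it assumes a large-sample normal approximation rather than the t-distribution, and the data and the function name `tost_normal_approx` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

# Hypothetical measurements, made up for illustration only.
test_vals = [10.0, 10.1, 9.9, 10.05, 9.95, 10.0, 10.1, 9.9]
ref_vals = [10.02, 9.98, 10.08, 9.92, 10.0, 10.05, 9.95, 10.0]


def tost_normal_approx(test, ref, margin, alpha=0.05):
    """Two one-sided tests for equivalence within +/- margin
    (normal approximation, unequal variances allowed)."""
    z = NormalDist()
    diff = mean(test) - mean(ref)
    se = sqrt(variance(test) / len(test) + variance(ref) / len(ref))
    # H0: diff <= -margin  versus  H1: diff > -margin
    p_lower = 1 - z.cdf((diff + margin) / se)
    # H0: diff >= +margin  versus  H1: diff < +margin
    p_upper = z.cdf((diff - margin) / se)
    # (1 - 2*alpha) confidence interval for the difference.
    z_crit = z.inv_cdf(1 - alpha)
    ci = (diff - z_crit * se, diff + z_crit * se)
    return {
        "p_lower": p_lower,
        "p_upper": p_upper,
        "ci": ci,
        # Equivalence is declared only if BOTH one-sided tests reject.
        "equivalent": max(p_lower, p_upper) < alpha,
    }


result = tost_normal_approx(test_vals, ref_vals, margin=0.1)
print(result)
```

Both p-values being below alpha is the same condition as the interval in `ci` lying entirely inside the range from -margin to +margin, which is why a low pair of p-values supports equivalence rather than a difference.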

The help documentation for this test should be useful.

Hopefully this helps,

Phil

Justin_Bui
Level III

Re: Newbie to statistic: Is that a P-hacking if I do equivalence test several time?

Thanks, Phil, for sharing.