FactorLevelFox5
New Member

Different p-values obtained from “Fit Y by X” versus “Response Screening”

Hello: 


I have a data set containing 15 observations in three groups/conditions (group 1: 5 observations, group 2: 6 observations, and group 3: 4 observations) for around 3,000 genes (variables), and I would like to run ANOVA and post hoc tests for those genes across the 3 groups.

 

I tried “Response Screening” and obtained ANOVA p-values for the 3,000 genes. I then obtained p-values for the pairwise comparisons between groups (e.g., group 3 vs. group 1, group 2 vs. group 1) by clicking “Show Means Differences” and right-clicking the column to show those p-values.

 

My question is: when I tested individual variables with “Fit Y by X”, I got the same ANOVA p-value; however, when I ran the post hoc test (Tukey HSD / Tukey-Kramer), I got different p-values for the comparisons between groups. Those values are not the same as those obtained from “Response Screening”. How come?


How are the p-values for group comparisons in “Response Screening” determined? I assume they are derived from some sort of post hoc test?

 

Thanks very much in advance for your help!

1 REPLY
Victor_G
Super User

Re: Different p-values obtained from “Fit Y by X” versus “Response Screening”

Hi @FactorLevelFox5,

 

Welcome to the Community!

 

The Tukey-Kramer test is adjusted for multiple comparisons: "This method protects the overall significance level for the tests of all combinations of pairs."

 

When you set your confidence level (for example, 0.95), it applies to ONE comparison, not to all the comparisons involved in your study. In your example, since you have 3 groups, there are 3 possible pairwise comparisons: group 1 with 2, group 1 with 3, and group 2 with 3. Assuming the comparisons were independent, the probability that all three are simultaneously correct would be only 0.95^3 ≈ 0.857.

That means that without a p-value adjustment technique, such as Bonferroni or Tukey-Kramer, your overall confidence level is only about 0.86, so you can expect many more false positives than intended (roughly a 14% familywise risk instead of 5%).

With an adjustment technique, the calculations are a little different, so you should expect higher p-values than in the non-adjusted tests: the adjustment takes into account the total number of comparisons you are making, and it is there to minimize the risk of declaring a statistical difference between groups when there is none.
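The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the familywise error rate; the independence assumption is an approximation (the pairwise comparisons share data), and the actual Tukey-Kramer adjustment is based on the studentized range distribution, not this simple product:

```python
# Toy illustration of the familywise error rate for 3 pairwise
# comparisons, assuming (approximately) independent tests.

per_test_conf = 0.95   # confidence level for ONE comparison
k = 3                  # number of pairwise comparisons among 3 groups

# Probability that all k comparisons are simultaneously correct
overall_conf = per_test_conf ** k           # 0.95^3 = 0.857375

# Familywise false-positive risk without any adjustment
familywise_alpha = 1 - overall_conf         # ~0.143, i.e. ~14%

# Bonferroni adjustment: divide the per-test alpha by k
bonferroni_alpha = (1 - per_test_conf) / k  # 0.05 / 3 ~ 0.0167

print(round(overall_conf, 3))               # 0.857
print(round(familywise_alpha, 3))           # 0.143
print(round(bonferroni_alpha, 4))           # 0.0167
```

Under Bonferroni, each pairwise comparison would be tested at the stricter 0.0167 level to keep the overall risk near 5%; Tukey-Kramer achieves the same protection with a less conservative, purpose-built criterion.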

 

Hope this clarifies the difference you found,

 

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)