
Different p-values obtained from “Fit Y by X” versus “Response Screening”

Hello: 


I have a data set containing 15 observations in three groups/conditions (group 1: 5 observations, group 2: 6 observations, and group 3: 4 observations) for around 3,000 genes (variables), and I would like to run ANOVA and post hoc tests for those genes across the 3 groups.

 

I tried “Response Screening” and got the ANOVA p-values for the 3,000 genes, and then also obtained the p-values for the comparisons between groups (i.e., group 3 vs. group 1, group 2 vs. group 1) by clicking “Show Means Differences” and right-clicking the column to display those p-values.

 

My question is: when I tested individual variables with “Fit Y by X”, I got the same p-value for the ANOVA; however, when I ran the post hoc test (Tukey HSD / Tukey-Kramer), the p-values for the comparisons between groups were different, and they do not match those obtained from “Response Screening”. How come?


How are the p-values for group comparisons in “Response Screening” determined? I assume they are derived from some sort of post-hoc tests?

 

Thanks very much in advance for your help!

Accepted Solution
Victor_G
Super User

Re: Different p-values obtained from “Fit Y by X” versus “Response Screening”

Hi @FactorLevelFox5,

 

Welcome to the Community!

 

The Tukey-Kramer test is an adjusted test for multiple comparisons: "This method protects the overall significance level for the tests of all combinations of pairs."

 

When you define your confidence level (for example at 0.95), it applies to ONE comparison, not to all the comparisons involved in your study. In your example, since you have 3 groups, there are 3 possible pairwise comparisons: group 1 with 2, group 1 with 3, and group 2 with 3. So your overall confidence level across all comparisons will be roughly 0.95^3 = 0.857.

That means that without a p-value adjustment technique, like Bonferroni or Tukey-Kramer, your overall confidence level will be around 0.86, so you can expect far more false positives than intended (roughly a 14% familywise risk instead of 5%).

With adjustment techniques, the calculations are a little different, so you can expect higher p-values (vs. the non-adjusted tests), because they take into account the total number of comparisons you are making; the adjustment is there to minimize the risk of finding a statistical difference between groups when there is none.
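To see this effect numerically, here is a minimal sketch in Python (not JMP/JSL) using made-up data with the same group sizes as yours (5, 6, and 4 observations); the group names and values are purely illustrative. It compares unadjusted pairwise Student's t-test p-values with the Tukey-Kramer adjusted p-values from statsmodels.

```python
# Illustrative only: unadjusted pairwise t-tests vs. the Tukey-Kramer
# adjustment on one made-up "gene" with group sizes 5, 6 and 4.
import numpy as np
from itertools import combinations
from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(1)

# Hypothetical expression values for one gene in three groups
groups = {
    "g1": rng.normal(10.0, 1.0, 5),
    "g2": rng.normal(10.5, 1.0, 6),
    "g3": rng.normal(12.0, 1.0, 4),
}

# Familywise confidence when each of the 3 comparisons is run at 95%
print("overall confidence ~", 0.95 ** 3)   # ~0.857, i.e. ~14% risk

# Unadjusted pairwise Student's t-tests
for a, b in combinations(groups, 2):
    t, p = stats.ttest_ind(groups[a], groups[b])
    print(f"{a} vs {b}: unadjusted p = {p:.4f}")

# Tukey-Kramer comparisons (protect the familywise error rate)
values = np.concatenate(list(groups.values()))
labels = np.repeat(list(groups.keys()), [len(v) for v in groups.values()])
print(pairwise_tukeyhsd(values, labels, alpha=0.05))
```

In a run like this you should typically see the Tukey-adjusted p-values come out larger than the unadjusted ones, which matches the pattern you observed between the two platforms.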

 

Hope this clarifies the difference you found,

 

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)

Re: Different p-values obtained from “Fit Y by X” versus “Response Screening”

Thanks for your quick reply! So, the p-value in the “Means Differences” table (Response Screening) is just a t-test (not a post hoc test after ANOVA)?
If so, is there any way to obtain the p-values of a post hoc analysis in Response Screening?
Thanks again!


Victor_G
Super User

Re: Different p-values obtained from “Fit Y by X” versus “Response Screening”

Hi @FactorLevelFox5,

 

Looking at the Help section of JMP, you can read that the p-values in the Response Screening platform are obtained from a Student's t-test for each pairwise comparison: Means Differences Data Table

 

However, you can also see a column with "FDR p-values", which are obtained from the same test but corrected for multiple comparisons using the False Discovery Rate technique: Statistical Details for the Response Screening Platform

You can read more about this technique here: Benjamini, Y., and Hochberg, Y. (1995). "Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing." Journal of the Royal Statistical Society, Series B 57:289–300.
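To illustrate how the two columns relate, here is a hedged Python sketch (not JMP's implementation) on made-up data: it computes the raw Student's t-test p-value for each pairwise comparison, then applies the Benjamini-Hochberg correction, which is the idea behind the FDR p-value columns.

```python
# Illustrative only: raw pairwise t-test p-values and their
# Benjamini-Hochberg (FDR) adjusted counterparts.
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(2)
g1 = rng.normal(10.0, 1.0, 5)   # hypothetical group values
g2 = rng.normal(10.5, 1.0, 6)
g3 = rng.normal(12.0, 1.0, 4)

pairs = [("g3 vs g1", g3, g1), ("g2 vs g1", g2, g1), ("g3 vs g2", g3, g2)]
raw_p = np.array([stats.ttest_ind(a, b).pvalue for _, a, b in pairs])

# Benjamini-Hochberg step-up procedure (method="fdr_bh")
reject, fdr_p, _, _ = multipletests(raw_p, alpha=0.05, method="fdr_bh")

for (name, _, _), p, q, r in zip(pairs, raw_p, fdr_p, reject):
    print(f"{name}: raw p = {p:.4f}, FDR p = {q:.4f}, significant = {r}")
```

Keep in mind that Response Screening applies the FDR adjustment across all the tests it reports, not just the three comparisons shown here, so treat this only as an illustration of the mechanics.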

 

The False Discovery Rate correction is not among the multiple-comparison options in Fit Y by X, because the emphasis differs somewhat between the corrections offered in the two platforms:

  • In Fit Y by X, a multiple-comparison correction may be needed when a categorical X with many groups is crossed with a continuous Y, because you are testing each possible pair of groups for the same variable X, and these comparisons are not independent of one another.
  • In Response Screening, a multiple-testing correction may be needed when you investigate the statistical significance of many independent X-Y relationships, as in the sketch below.
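As a rough illustration of that second point (not how JMP computes it), the sketch below assumes 3,000 made-up "genes", only the first 100 of which have a real group difference, runs a one-way ANOVA per gene, and shows how the Benjamini-Hochberg correction keeps the number of false discoveries under control compared with using raw p < 0.05.

```python
# Illustrative only: many independent one-way ANOVAs (one per "gene"),
# then Benjamini-Hochberg FDR control across all of them.
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(3)
n_genes = 3000
n_true = 100                         # genes with a real effect (made up)

p_values = np.empty(n_genes)
for i in range(n_genes):
    shift = 2.0 if i < n_true else 0.0      # effect only for the first 100
    g1 = rng.normal(10.0, 1.0, 5)
    g2 = rng.normal(10.0, 1.0, 6)
    g3 = rng.normal(10.0 + shift, 1.0, 4)
    p_values[i] = stats.f_oneway(g1, g2, g3).pvalue

reject_raw = p_values < 0.05
reject_fdr, _, _, _ = multipletests(p_values, alpha=0.05, method="fdr_bh")

# False positives = significant calls among genes with no real effect
print("raw p < 0.05:", reject_raw.sum(), "hits,",
      reject_raw[n_true:].sum(), "false positives")
print("FDR < 0.05  :", reject_fdr.sum(), "hits,",
      reject_fdr[n_true:].sum(), "false positives")
```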

 

Hope this answer clarifies the report and the use of this platform,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)