hallef18
Level I

Are two sets of data different?

I have 2 sets of simulated particle trajectory data. I'm trying to figure out if they are statistically significantly different. If it's helpful, each data set contains x-coordinate, y-coordinate, distance, energy, etc. I've tried plotting histograms of x-coordinate1 (continuous) and x-coordinate2 (continuous). The normal quantile plots show that neither data set is Gaussian, so I tested the mean of one data set against a hypothesized value equal to the other data set's mean, and checked the Wilcoxon Signed Rank box (I think this is a test to use for non-normal data).
That produced the t-test and the Signed-Rank test. Since the signed-rank Prob > |t| value is <.0001, I reject the null hypothesis that the actual mean is the same as the hypothesized mean. So these two means are different, and the data sets (at least the x-coordinate data) are statistically different.

 

Finally, my question: does this make sense? Did I use these tests correctly? If not, any recommendations? Thank you!

2 REPLIES
peng_liu
Staff

Re: are two sets of data different

You seem to be talking about two different things. In your second sentence, you probably mean whether the two data sets are drawn from the same distribution. Later, based on what you have done, you probably mean whether the two data sets have the same mean. These are rather different objectives.
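To make the distinction concrete, here is a minimal Python sketch (not JMP, and not something anyone in this thread ran) of a permutation test applied to both questions: do the means differ, and do the distributions differ? The function names and the toy data are hypothetical, chosen only for illustration. The two samples are built to share a mean of exactly 0 while having very different spreads.

```python
import random
from bisect import bisect_right
from statistics import mean


def mean_difference(a, b):
    """Absolute difference between the two sample means."""
    return abs(mean(a) - mean(b))


def ks_statistic(a, b):
    """Largest vertical gap between the two empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    return max(
        abs(bisect_right(sa, v) / len(sa) - bisect_right(sb, v) / len(sb))
        for v in sa + sb
    )


def permutation_p_value(a, b, statistic, n_perm=2000, seed=0):
    """Share of random relabelings whose statistic is at least the observed one."""
    rng = random.Random(seed)
    observed = statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign values to the two groups at random
        if statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    # Count the observed labeling itself so the p-value is never zero.
    return (hits + 1) / (n_perm + 1)


# Toy data (an assumption for illustration): same mean, different spread.
x1 = [i / 10 for i in range(-20, 21)]  # values from -2 to 2
x2 = [i / 2 for i in range(-20, 21)]   # values from -10 to 10

p_mean = permutation_p_value(x1, x2, mean_difference)
p_dist = permutation_p_value(x1, x2, ks_statistic)
print(f"same mean?         p = {p_mean:.4f}")
print(f"same distribution? p = {p_dist:.4f}")
```

With this toy data the mean-based test has nothing to detect, while the distribution-based test picks up the difference in spread. That is why the two objectives need to be kept apart.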

For what you have done, please check out this piece of documentation: https://www.jmp.com/support/help/en/18.0/?os=win&source=application#page/jmp/test-mean.shtml

I don't think the nonparametric test is appropriate or relevant to your objective of comparing means.

Re: are two sets of data different

To expand a little bit on what @peng_liu said, you have to ask yourself: what do you mean by different? Does different mean different distributions? Or does it mean different spreads? Or different centers? Or a different number of modes? Different relationships between the variables? A different number of outliers? A different covariance structure? Etc.
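As a rough illustration of looking at several kinds of "different" side by side, here is a small Python sketch (not JMP). The `describe` helper and the two tiny data sets are hypothetical. They are constructed so that the x-coordinates match exactly in center and spread, while the relationship between x and y is reversed.

```python
from statistics import mean, median, quantiles, stdev


def describe(xs, ys):
    """Summaries of one data set's x-coordinates and its x-y relationship."""
    q1, _, q3 = quantiles(xs, n=4)
    mx, my = mean(xs), mean(ys)
    # Sample covariance of x and y, then scaled to a correlation.
    cov_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return {
        "center (median of x)": median(xs),
        "spread (std dev of x)": stdev(xs),
        "spread (IQR of x)": q3 - q1,
        "x-y correlation": cov_xy / (stdev(xs) * stdev(ys)),
    }


# Toy data (an assumption for illustration): identical x, reversed y.
set_a = describe([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
set_b = describe([1, 2, 3, 4, 5], [10, 8, 6, 4, 2])

for name in set_a:
    print(f"{name:24s} A = {set_a[name]: .3f}   B = {set_b[name]: .3f}")
```

A comparison of the x-coordinates alone would call these two sets identical, yet the correlation row shows they behave in opposite ways. Which row matters depends on the question being asked of the data.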

 

If you just ask whether two sets of data are different, the safest answer would probably be YES; it just depends on HOW they are different. But those differences may not matter to you. So what does matter?

Dan Obermiller