are two sets of data different
I have two sets of simulated particle trajectory data, and I'm trying to figure out whether they are statistically significantly different. If it's helpful, each data set contains x-coordinate, y-coordinate, distance, energy, etc. I've tried plotting histograms of x-coordinate1 (continuous) and x-coordinate2 (continuous). The normal quantile plots show that neither data set is Gaussian, so I tested the mean of one data set against a hypothesized value equal to the other data set's mean and checked the Wilcoxon Signed Rank box (I think this is a test to use for non-normal data).
That produced a t-test and a Signed-Rank test. Since the Signed-Rank Prob > |t| value is < .0001, I reject the null hypothesis that the actual mean is the same as the hypothesized mean. So these two means are different, and the data sets (at least the x-coordinate data) are statistically different.
Finally, my question: does this make sense? Did I use these tests correctly? If not, any recommendations? Thank you!
Re: are two sets of data different
You seem to be talking about two different things. In your second sentence, you probably mean whether the two data sets are drawn from the same distribution. Later, based on what you have done, you probably mean whether the two data sets have the same mean. Those are rather different objectives.
For what you have done, please check out this piece of documentation: https://www.jmp.com/support/help/en/18.0/?os=win&source=application#page/jmp/test-mean.shtml
I don't think the nonparametric test is appropriate for, or relevant to, your objective of comparing means.
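For comparing the means of two samples directly, one option that does not assume normality is a two-sample permutation test. The sketch below is only an illustration in Python, not JMP output: the function name `permutation_test_mean_diff`, its default parameters, and the simulated exponential samples are all assumptions standing in for the real x-coordinate columns. Unlike testing one sample against the other's mean as a fixed hypothesized value, it treats both samples as random.

```python
import random


def permutation_test_mean_diff(a, b, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the difference in means of a and b."""
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    n_a = len(a)
    hits = 0
    for _ in range(n_perm):
        # Reassign group labels at random and recompute the mean difference.
        rng.shuffle(pooled)
        group_a, group_b = pooled[:n_a], pooled[n_a:]
        diff = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical skewed (non-Gaussian) samples, not the poster's data.
data_rng = random.Random(42)
x1 = [data_rng.expovariate(1.0) for _ in range(200)]
x2 = [data_rng.expovariate(1.0) + 0.5 for _ in range(200)]

p_value = permutation_test_mean_diff(x1, x2)
print(f"permutation p-value for equal means: {p_value:.4f}")
```

A small p-value here speaks only to the means. It says nothing about whether the two samples share the same spread or shape.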
Re: are two sets of data different
To expand a little bit on what @peng_liu said, you have to ask yourself: what do you mean by different? Does different mean different distributions? Or does it mean different spreads? Or different centers? Or a different number of modes? Different relationships between the variables? A different number of outliers? A different covariance structure? And so on.
If you just ask whether two sets of data are different, the safest answer is probably YES; it just depends on HOW they are different. But those differences may not matter to you. So what does matter?
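To make the point concrete, here is a rough Python sketch (not a JMP feature) that measures several kinds of "different" side by side. The helper names `ks_statistic` and `summarize_differences` and the toy data are hypothetical, chosen so that the two samples share a center but differ in spread.

```python
from bisect import bisect_right
from statistics import mean, median, stdev


def ks_statistic(a, b):
    """Largest vertical gap between the two empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    gap = 0.0
    for v in sa + sb:
        # Fraction of each sample that is <= v.
        fa = bisect_right(sa, v) / len(sa)
        fb = bisect_right(sb, v) / len(sb)
        gap = max(gap, abs(fa - fb))
    return gap


def summarize_differences(a, b):
    """Compare center, spread, and overall distribution of two samples."""
    return {
        "mean_diff": mean(a) - mean(b),
        "median_diff": median(a) - median(b),
        "sd_ratio": stdev(a) / stdev(b),
        "ks_statistic": ks_statistic(a, b),
    }


# Hypothetical toy data: same center, different spread.
narrow = [-1.0, -0.5, 0.0, 0.5, 1.0]
wide = [-4.0, -2.0, 0.0, 2.0, 4.0]

summary = summarize_differences(narrow, wide)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

In this toy case a comparison of means or medians would find nothing, while the spread ratio and the gap between the empirical CDFs both show that the samples differ. Which of these quantities to test depends on which kind of difference matters for the trajectories.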