How to do hypothesis test with highly right skewed data that contains many zeros?


Aug 6, 2019 12:00 AM

Our data is "defect count on substrate". Suppose we implement a new cleaning method and want to know whether it is better than the original method; that is, we want to perform a hypothesis test to decide.

The problem is that in both samples most of the values are zero, and the distribution is right-skewed, with a tail of substrates that have several defects. Is there a good method for a hypothesis test in this case?

My original idea was to transform the data to a normal distribution and then perform a two-sample t test. Since most of the values are zero, I tried log(x+1) to transform my data, but the result still fails to fit a normal distribution in JMP's continuous fit.
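One reason the transform cannot help: log(0+1) = 0, so every zero stays a zero, and the spike at zero is as large after the transform as before. A minimal sketch of that check, using made-up counts (not the data from this post):

```python
import math

# Hypothetical defect counts for illustration only (not the poster's data).
counts = [0] * 36 + [1] * 6 + [2] * 4 + [3, 5, 8, 13]

# log(x + 1) transform, as described in the post.
transformed = [math.log1p(x) for x in counts]

# Share of observations that are exactly zero, before and after.
zero_share_raw = sum(x == 0 for x in counts) / len(counts)
zero_share_log = sum(t == 0.0 for t in transformed) / len(transformed)

print(f"share of zeros before transform: {zero_share_raw:.2f}")
print(f"share of zeros after transform:  {zero_share_log:.2f}")
```

Any monotone transform has the same limitation: it can compress the right tail, but it cannot spread out a point mass at a single value.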

1 ACCEPTED SOLUTION


The logistic distribution is symmetric, so that choice is not the best for your data. I would not use the Wilcoxon test.

The double exponential distribution (also known as the Gumbel distribution) is skewed, like your data, so it would be a better choice. I would use the Median test.

Learn it once, use it forever!
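For readers who want to see what a median test computes on data like this, here is a rough sketch of the textbook (Mood's) two-sample median test in Python. It is an illustration under assumptions, not JMP's implementation: the counts are made up, the function name `median_test` is hypothetical, and the p-value comes from a plain Pearson chi-square on the 2x2 table with no continuity correction, so JMP's output may differ.

```python
import math
from statistics import median

def median_test(a, b):
    """Textbook two-sample median test (Pearson chi-square, 1 df, no correction)."""
    groups = (list(a), list(b))
    grand = median(groups[0] + groups[1])          # pooled (grand) median
    above = [sum(x > grand for x in g) for g in groups]
    not_above = [len(g) - k for g, k in zip(groups, above)]
    n = len(groups[0]) + len(groups[1])
    row_totals = [sum(above), sum(not_above)]
    col_totals = [len(groups[0]), len(groups[1])]
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("table is degenerate; the test is undefined")
    chi2 = 0.0
    for r, row in enumerate((above, not_above)):
        for c, observed in enumerate(row):
            expected = row_totals[r] * col_totals[c] / n
            chi2 += (observed - expected) ** 2 / expected
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return grand, above, chi2, p_value

# Hypothetical defect counts for illustration only.
old_method = [0] * 30 + [1] * 8 + [2] * 5 + [3] * 3 + [5, 7, 9, 12]
new_method = [0] * 42 + [1] * 5 + [2, 3, 4]

grand, above, chi2, p_value = median_test(old_method, new_method)
print(f"grand median = {grand}, counts above it = {above}")
print(f"chi-square = {chi2:.3f}, p-value = {p_value:.4f}")
```

Note that when more than half of the pooled values are zero, the grand median is zero, so "above the median" simply means "has at least one defect". In that situation the median test compares the share of substrates with any defect between the two methods and ignores how large the nonzero counts are.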

3 REPLIES


Re: How to do hypothesis test with highly right skewed data that contains many zeros?

I can think of two approaches. The first is a non-parametric test. Several are available in the Oneway platform, alongside the t tests. The second is to define a meaningful sample statistic (e.g., the 0.9 quantile) and use a bootstrap to obtain a p-value for the difference between the two methods.
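Outside JMP, the second approach could be sketched roughly as follows. This is only one way to set up such a bootstrap, under assumptions: the two samples are pooled and resampled with replacement, which imposes the null hypothesis that both methods share one distribution, and the p-value is one-sided (old method worse than new). The data and the names `quantile` and `bootstrap_p_value` are hypothetical.

```python
import random

def quantile(values, q):
    """Empirical quantile, linear interpolation between order statistics."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

def bootstrap_p_value(old, new, stat, n_boot=5000, seed=0):
    """One-sided p-value for H1: stat(old) > stat(new), via pooled resampling."""
    rng = random.Random(seed)
    observed = stat(old) - stat(new)
    pooled = list(old) + list(new)   # pooling imposes the null of no difference
    hits = 0
    for _ in range(n_boot):
        b_old = rng.choices(pooled, k=len(old))   # resample with replacement
        b_new = rng.choices(pooled, k=len(new))
        if stat(b_old) - stat(b_new) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (hits + 1) / (n_boot + 1)

# Hypothetical defect counts for illustration only.
old_method = [0] * 30 + [1] * 8 + [2] * 5 + [3] * 3 + [5, 7, 9, 12]
new_method = [0] * 42 + [1] * 5 + [2, 3, 4]

def q90(values):
    return quantile(values, 0.9)

observed, p_value = bootstrap_p_value(old_method, new_method, q90)
print(f"observed difference in 0.9 quantile: {observed}")
print(f"bootstrap p-value (one-sided): {p_value:.4f}")
```

The statistic is passed in as a function, so the same routine works for the mean defect count or the share of nonzero substrates. With this many zeros, a low quantile such as the median would be zero in both groups, which is why a high quantile like 0.9 is the more informative choice.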



Re: How to do hypothesis test with highly right skewed data that contains many zeros?

Thank you for your reply. For the nonparametric tests, I looked up the JMP help, and it says:

Wilcoxon Test --> powerful for logistic distributions

Median Test --> powerful for double-exponential distributions

van der Waerden Test --> powerful for normal distributions

Kolmogorov-Smirnov Test --> not so sure

So my question is: for an extremely right-skewed distribution, which nonparametric method is more suitable?

