34South
Level III

Reporting of skewed versus normally distributed data?

I am involved in a study which evaluates continuous numerical data across multiple groups. While the number of replicates is recognised as being low (n=10 per group), it is not possible to rectify that in this study. Likely due to this, some related (similar in nature) variables, which might otherwise have been normally distributed, are skewed (based on Shapiro-Wilk testing). I apply non-parametric or parametric tests as appropriate when comparing each variable between the groups, but my question is how to report the data. I have always understood that skewed data are reported as medians and ranges, and normally distributed data as means ± standard deviations. However, in a manuscript this mixed approach looks messy, especially when, as I have suggested, the variables are similar in nature, for example left and right anatomical distances. Can I use means and standard deviations throughout the manuscript, even when in some cases the data are skewed? I would still apply the correct type of statistical test.

Accepted Solutions
P_Bartell
Level VIII

Re: Reporting of skewed versus normally distributed data?

I don't think there is a 'right' or 'wrong' answer here. Recall that means and medians are just different measures of central tendency, and standard deviation is but one measure of dispersion. Is there any reason you can't report both means and medians? After all, they are just descriptive statistics. Anyway, I'd tend towards just showing the frequency distributions themselves and letting the reader interpret the pictures, not the mononumerotic 'statistics'.
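As a rough illustration of reporting both sets of descriptive statistics side by side, here is a minimal Python sketch using only the standard library. The function names `describe` and `format_row` and the left/right measurements are made up for the example; they are not from the study discussed above, and this is not a JMP workflow.

```python
import statistics


def describe(values):
    """Return n, mean, sample SD, median, min, and max for one group."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),  # sample SD (n - 1 denominator)
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }


def format_row(name, values):
    """Format one variable as 'mean ± SD' plus 'median (range)'."""
    s = describe(values)
    return (
        f"{name}: n={s['n']}, "
        f"mean ± SD = {s['mean']:.2f} ± {s['sd']:.2f}, "
        f"median (range) = {s['median']:.2f} "
        f"({s['min']:.2f} to {s['max']:.2f})"
    )


# Hypothetical measurements, n=10 per variable; the last "right" value
# is deliberately extreme to mimic a skewed sample.
left_distance = [12.1, 12.4, 11.9, 12.0, 12.3, 12.2, 11.8, 12.5, 12.1, 12.0]
right_distance = [12.0, 12.2, 11.9, 12.1, 12.3, 12.0, 11.9, 12.1, 12.2, 15.8]

for name, values in [("left", left_distance), ("right", right_distance)]:
    print(format_row(name, values))
```

Giving every variable the same row layout keeps a table uniform across related variables, while still letting the reader see where the mean and median diverge.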

9 REPLIES
34South
Level III

Re: Reporting of skewed versus normally distributed data?

I guess the old statistics adage, "you can do anything, as long as you say what you did", holds! Thanks for providing reassurance that what I propose is acceptable.

statman
Super User

Re: Reporting of skewed versus normally distributed data?

I agree with Pete. I will say, according to Shewhart (Economic Control...P. 94), both the average and the standard deviation are always useful statistics. Continuing in that section of his book, he finds the measures of skewness and flatness to be of little additional value.

"All models are wrong, some are useful" G.E.P. Box
hogi
Level XII

Re: Reporting of skewed versus normally distributed data?

There might be other fields where skewness plays an important role.

Just think of lifetimes, mortality, crime rates, meteors ... where outliers in one direction are much more frightening than outliers in the other direction.

statman
Super User

Re: Reporting of skewed versus normally distributed data?

You'll have to argue with Shewhart. Those comments are directly from his book.

"All models are wrong, some are useful" G.E.P. Box
hogi
Level XII

Re: Reporting of skewed versus normally distributed data?

Hm, not easy ...
With some help of Copilot:

[Image: hogi_0-1750489261347.png]

A response in the spirit of JMP : )
I have to admit, we drifted away from the original question ...

statman
Super User

Re: Reporting of skewed versus normally distributed data?

I love it.  Just goes to show you how wrong AI can be.  The words Special Cause and Common Cause were Deming's.  Shewhart used Assignable and unassignable/chance/random to describe the same.

"All models are wrong, some are useful" G.E.P. Box
hogi
Level XII

Re: Reporting of skewed versus normally distributed data?

Thanks, right!
And the term "common cause" was coined by Harry Alpert in 1947?

I don't know why Shewhart used it in this response.
Perhaps he had heard the new terminology and found it useful.

statman
Super User

Re: Reporting of skewed versus normally distributed data?

"I don't know why Shewhart used it in this response."

He didn't, that's my point.  You are using some AI to generate what it thinks Shewhart would have said and it is wrong.  Please read Deming's "Out of the Crisis" Chapter 11.

Yes, Dr. Alpert coined the phrase on the subject of riots in prison.

"All models are wrong, some are useful" G.E.P. Box
