
Normal Distributions and Transformations

natalie_

Community Trekker

Joined:

Jan 6, 2016

Hi Everyone,

 

I have some measured data, and when I try a continuous normal fit I can see that my data is not normal. However, the Goodness-of-Fit Test indicates that the data is consistent with the Johnson Su distribution.

 

This distribution has two shape parameters, one location parameter, and one scale parameter. From my research online, I can see how to calculate the variance from these parameters, and from that the standard deviation. I used Excel to calculate that, but is there a way to do this in JMP? From my understanding, the Summary Statistics table from the "Distributions" analysis calculates these statistics assuming the data is from the normal distribution.
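For reference, the calculation described above can be sketched in Python. This is only an illustration, not JMP's own method: it assumes the common parameterization in which Z = gamma + delta * asinh((X - xi) / lam) is standard normal, and the function name and parameter values are hypothetical. The parameter names in a given fit report may differ, so the convention should be checked before reusing the formulas.

```python
import math

def johnson_su_moments(gamma, delta, xi, lam):
    """Return (mean, standard deviation) of a Johnson Su variable.

    Assumed parameterization (check it against your software):
        Z = gamma + delta * asinh((X - xi) / lam),  Z ~ N(0, 1)
    gamma, delta are the shape parameters, xi the location, lam the scale.
    """
    if delta <= 0 or lam <= 0:
        raise ValueError("delta and lam must be positive")
    w = math.exp(1.0 / delta**2)
    omega = gamma / delta
    mean = xi - lam * math.sqrt(w) * math.sinh(omega)
    var = 0.5 * lam**2 * (w - 1.0) * (w * math.cosh(2.0 * omega) + 1.0)
    return mean, math.sqrt(var)

# Hypothetical fitted parameters, for illustration only.
mean, sd = johnson_su_moments(gamma=-0.5, delta=1.5, xi=10.0, lam=2.0)
print(f"mean = {mean:.4f}, sd = {sd:.4f}")
```

With gamma = 0 the distribution is symmetric and the mean reduces to the location parameter, which is a quick sanity check on the formula.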

 

Thanks in advance!

 

Natalie

7 REPLIES
txnelson

Super User

Joined:

Jun 22, 2012

Natalie,

You should be able to simply save the transform to a new column, and then run the distribution on that column.
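Outside JMP, the same idea can be sketched in Python: apply the fitted transform to each measurement and summarize the transformed column. This assumes the usual Johnson Su transform to normality, Z = gamma + delta * asinh((X - xi) / lam); the parameter values and measurements below are made up for illustration and are not taken from the thread.

```python
import math
import statistics

def johnson_su_to_normal(x, gamma, delta, xi, lam):
    """Map a Johnson Su value x to the (approximately) normal scale.

    Assumed form: z = gamma + delta * asinh((x - xi) / lam).
    """
    return gamma + delta * math.asinh((x - xi) / lam)

# Hypothetical fitted parameters and measurements, for illustration only.
params = dict(gamma=-0.5, delta=1.5, xi=10.0, lam=2.0)
measurements = [9.1, 10.4, 11.2, 12.8, 10.9, 14.3, 11.7, 10.1]

# The "new column": one transformed value per original measurement.
z = [johnson_su_to_normal(x, **params) for x in measurements]

print("transformed mean:", statistics.mean(z))
print("transformed std: ", statistics.stdev(z))
```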

 

Jim

David_Burnham

Super User

Joined:

Jul 13, 2011

Natalie

The formulas for variance and standard deviation don't make any assumption about the shape of the distribution. It's just algebra (in the same way that calculating an average doesn't make any assumptions about the type of distribution).

-Dave
natalie_

Community Trekker

Joined:

Jan 6, 2016

Oh, I thought it did matter for standard deviation, though. For example, the 68-95-99.7 (three standard deviations) rule is used to find the values within a band around the mean in a normal distribution. However, if my data is not normal, it might not make sense to use this. For example, if the on-resistance of my transistor is not normal, and I want to see what the value is at 3 standard deviations from the mean, I might get a negative value or a very low value that doesn't actually make any sense.

 

Sorry if I am being confusing or misunderstanding something, I am just starting to get back into learning statistics again since university!

David_Burnham

Super User

Joined:

Jul 13, 2011

I think I missed the point of your question. If you want to calculate "bands" based on probability, then the location of these bands will differ according to the type of distribution you have. Your numbers 68-95-99.7 are not standard deviations; they are the probabilities associated with bands at distances of 1, 2, and 3 standard deviations from the mean of a normal distribution. If you don't have a normal distribution, the problem is not with the calculation of the standard deviation but with the conversion to probabilities. If you want +/- 3 standard deviation bands, then you are assuming the distribution is normal, or at least symmetric. Depending on what you want to do, you can either calculate asymmetric bands (JMP has probability distributions not only for the normal distribution, but for all distributions), or perform a transformation to normalise the data (and then back-transform whenever you want to convert back to natural metrics). My preference would be to use asymmetric bands and use the JOHNSON SU function to calculate them.
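To make the asymmetric-band idea concrete, here is a minimal Python sketch rather than JMP code. It assumes the common Johnson Su parameterization, Z = gamma + delta * asinh((X - xi) / lam) with Z standard normal, so that a quantile is x_p = xi + lam * sinh((z_p - gamma) / delta). The function names and parameter values are hypothetical. Each band covers the same probability as the matching +/- k standard deviation band of a normal distribution.

```python
import math
from statistics import NormalDist

def johnson_su_quantile(p, gamma, delta, xi, lam):
    """Quantile of the Johnson Su distribution (assumed parameterization)."""
    z = NormalDist().inv_cdf(p)
    return xi + lam * math.sinh((z - gamma) / delta)

def probability_bands(gamma, delta, xi, lam, ks=(1, 2, 3)):
    """Bands holding the same probability as +/- k sigma under a normal."""
    std_normal = NormalDist()
    bands = {}
    for k in ks:
        p_lo = std_normal.cdf(-k)  # e.g. about 0.00135 for k = 3
        p_hi = std_normal.cdf(k)   # e.g. about 0.99865 for k = 3
        bands[k] = (
            johnson_su_quantile(p_lo, gamma, delta, xi, lam),
            johnson_su_quantile(p_hi, gamma, delta, xi, lam),
        )
    return bands

# Hypothetical fitted parameters, for illustration only.
params = dict(gamma=-0.5, delta=1.5, xi=10.0, lam=2.0)
median = johnson_su_quantile(0.5, **params)
for k, (lo, hi) in probability_bands(**params).items():
    print(f"k={k}: lower={lo:.3f}, median={median:.3f}, upper={hi:.3f}")
```

For a skewed fit, the upper and lower limits sit at different distances from the median, which is the asymmetry described above.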

online help

-Dave
natalie_

Community Trekker

Joined:

Jan 6, 2016

Thank you, I see how it did that. Now that I see that the transformed data is normal, how can I use this to find the standard deviation? The summary statistics show a value that makes sense for the transformed data, but I would like to know what the standard deviation is for the original data. Perhaps I don't understand the purpose of transforming data.

 

txnelson

Super User

Joined:

Jun 22, 2012

Solution
Here is what I do. To set limits on my original data based on the transformed values, I take the standard deviation of the transformed data, calculate the values 1, 2, 3, etc. standard deviations above and below the mean, and then reverse the transformation to get back to the original scale. In some cases, such as the Johnson SU, there isn't an easy way to transform the values back. What I do then is run a little script that passes a value through the original transformation, compares the result with the targeted value, and iterates until there is a match. You have then found the value in the original data that, when transformed, gives the targeted value. Remember, when you do this, the distances above and below the mean in your original data will not be the same.
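The iteration described here could look roughly like the following Python sketch. It is not the actual JMP script; it uses plain bisection, which works for any increasing transformation. The forward transform, its parameters, and the summary statistics are hypothetical. For the standard Johnson Su form assumed here, the inverse can also be written directly with sinh, which gives a handy cross-check on the iterated result.

```python
import math

def forward(x, gamma=-0.5, delta=1.5, xi=10.0, lam=2.0):
    """Hypothetical fitted transform: original scale -> normal scale."""
    return gamma + delta * math.asinh((x - xi) / lam)

def invert_by_bisection(transform, target, lo, hi, tol=1e-9, max_iter=200):
    """Find x in [lo, hi] with transform(x) close to target.

    Assumes transform is increasing and the target is bracketed.
    """
    if not (transform(lo) <= target <= transform(hi)):
        raise ValueError("target is not bracketed by [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if transform(mid) < target:
            lo = mid  # answer lies in the upper half
        else:
            hi = mid  # answer lies in the lower half
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Hypothetical mean and std of the transformed column.
z_mean, z_sd = 0.0, 1.0
for k in (-3, -2, -1, 0, 1, 2, 3):
    target = z_mean + k * z_sd
    x = invert_by_bisection(forward, target, lo=-1000.0, hi=1000.0)
    print(f"{k:+d} std on transformed scale -> {x:.4f} on original scale")
```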
Jim
natalie_

Community Trekker

Joined:

Jan 6, 2016

Thanks for your reply, Jim! I will give this a shot.