natalie_
Level V

Normal Distributions and Transformations

Hi Everyone,

 

I have some measured data, and when I try a continuous normal fit, I can see that my data is not normal.  However, the Goodness-of-Fit Test indicates that the data follows a Johnson Su distribution.

 

This distribution has two shape parameters, one location parameter, and one scale parameter.  From my research online, I can see how to calculate the variance from these parameters, and from that the standard deviation.  I used Excel to calculate that, but is there a way to do this in JMP?  From my understanding, the Summary Statistics table from the "Distribution" analysis calculates these statistics assuming the data is from the normal distribution.

 

Thanks in advance!

 

Natalie

txnelson
Super User

Re: Normal Distributions and Transformations

Natalie,

You should be able to simply save the transform to a new column, and then run the distribution on that column.

 

Jim

David_Burnham
Super User (Alumni)

Re: Normal Distributions and Transformations

Natalie

The formulas for variance and standard deviation don't make any assumption about the shape of the distribution.  They are just algebra (in the same way that calculating an average value doesn't make any assumptions about the type of distribution).

-Dave
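As a rough illustration of the point above, the Python sketch below computes a sample standard deviation directly from data (no distributional assumption) and, separately, the standard deviation implied by a set of Johnson Su parameters. It assumes the parameterization in which gamma + delta * asinh((X - theta) / sigma) is standard normal; the parameter values and the data are made up for illustration, and the function name is hypothetical.

```python
import math
import statistics


def johnson_su_mean_sd(gamma, delta, theta, sigma):
    """Mean and SD of X, assuming gamma + delta*asinh((X - theta)/sigma) ~ N(0, 1)."""
    w = math.exp(1.0 / delta**2)
    mean = theta - sigma * math.sqrt(w) * math.sinh(gamma / delta)
    var = 0.5 * sigma**2 * (w - 1.0) * (w * math.cosh(2.0 * gamma / delta) + 1.0)
    return mean, math.sqrt(var)


# Hypothetical fitted parameters (two shape, one location, one scale).
mean, sd = johnson_su_mean_sd(gamma=-1.0, delta=2.0, theta=10.0, sigma=1.5)

# The sample SD uses the same formula whatever the distribution is.
data = [9.8, 10.4, 11.1, 10.9, 12.3, 10.2, 13.0, 11.5]  # made-up measurements
sample_sd = statistics.stdev(data)

print(f"parametric mean={mean:.4f}, sd={sd:.4f}; sample sd={sample_sd:.4f}")
```

The two numbers answer different questions: the sample value summarizes the data at hand, while the parametric value is what the fitted distribution implies.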
natalie_
Level V

Re: Normal Distributions and Transformations

Oh, I thought it did matter for standard deviation, though.  For example, the 68-95-99.7 (three standard deviations) rule is used to find the values within a band around the mean in a normal distribution.  However, if my data is not normal, it might not make sense to use this.  For example, if the on-resistance of my transistor is not normal, and I want to see what the value is at 3 standard deviations from the mean, I might get a negative value or a very low value that doesn't make any sense.

 

Sorry if I am being confusing or misunderstanding something, I am just starting to get back into learning statistics again since university!

David_Burnham
Super User (Alumni)

Re: Normal Distributions and Transformations

I think I missed the point of your question.  If you want to calculate "bands" based on probability, then the location of these bands will differ according to the type of distribution you have.  Your numbers 68-95-99.7 are not standard deviations; they are the probabilities associated with bands at distances of 1, 2, and 3 standard deviations from the mean of a normal distribution.  If you don't have a normal distribution, the problem is not with the calculation of the standard deviation but with the conversion to probabilities.  If you want +/- 3 standard deviation bands, then you are assuming the distribution is normal, or at least symmetric.  Depending on what you want to do, you can either calculate asymmetric bands (JMP has probability functions not only for the normal distribution but for other distributions as well), or perform a transformation to normalise the data (and then back-transform whenever you want to convert back to the original units).  My preference would be to use asymmetric bands and use the Johnson Su function to calculate them.

online help

-Dave
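One way to read the asymmetric-band suggestion: take the tail probabilities that a normal distribution assigns to 1, 2, and 3 standard deviations and look up the corresponding quantiles of the fitted distribution. The Python sketch below does this under the assumption that gamma + delta * asinh((X - theta) / sigma) is standard normal; the function names and parameter values are hypothetical, and this is not JMP's own implementation.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def johnson_su_quantile(p, gamma, delta, theta, sigma):
    """Quantile of X, assuming gamma + delta*asinh((X - theta)/sigma) ~ N(0, 1)."""
    z = STD_NORMAL.inv_cdf(p)
    return theta + sigma * math.sinh((z - gamma) / delta)


def equivalent_band(k, gamma, delta, theta, sigma):
    """Band with the same coverage as +/- k SD has under a normal distribution."""
    p_low = STD_NORMAL.cdf(-k)  # e.g. about 0.00135 for k = 3
    low = johnson_su_quantile(p_low, gamma, delta, theta, sigma)
    high = johnson_su_quantile(1.0 - p_low, gamma, delta, theta, sigma)
    return low, high


# Hypothetical fitted parameters, for illustration only.
fit = dict(gamma=-1.0, delta=2.0, theta=10.0, sigma=1.5)
median = johnson_su_quantile(0.5, **fit)
for k in (1, 2, 3):
    low, high = equivalent_band(k, **fit)
    print(f"k={k}: [{low:.3f}, {high:.3f}]  below median: {median - low:.3f}, above: {high - median:.3f}")
```

With a skewed fit such as this one, the band extends further on one side of the median than on the other, which is the asymmetry described above.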
kowa
Level II

Re: Normal Distributions and Transformations

Hello David and everyone,

 

I'm new to JMP and I'm wondering about data transformation.
What are the advantages and disadvantages of fitting a distribution to the data, compared with transforming the data to obtain a normal distribution?

 

Also, if I don't get a good p-value after the transformation, should I reconsider my transformation?

 

Thank you

statman
Super User

Re: Normal Distributions and Transformations

Kowa, first welcome to the community.  Your query cannot be answered sufficiently in this forum.  I suggest you start here:

 

https://www.ime.usp.br/~abe/lista/pdfQWaCMboK68.pdf

 

There are 2 primary reasons to transform data:

1. To meet the quantitative assumptions of normally and independently distributed residuals with a mean of 0 and a constant variance (NID(0, variance)).  If these assumptions are not met, then you should question the proposed model.

2. As Dr. Box told me years ago, the only reason to transform is to simplify the model; in essence, to make the model more usable.

 

"All models are wrong, some are useful" G.E.P. Box
natalie_
Level V

Re: Normal Distributions and Transformations

Thank you, I see how it did that.  Now that I see that the transformed data is normal, how can I use this to find the standard deviation?  The summary statistics show a value that makes sense for the transformed data, but I would like to know what the standard deviation is for the original data.  Perhaps I don't understand the purpose of transforming data.

 

txnelson
Super User

Re: Normal Distributions and Transformations

Here is what I do. To set limits on my original data based on the transformed values, I take the standard deviation of the transformed data, calculate the values 1, 2, 3, etc. standard deviations above and below the mean, and then reverse the transformation to get back to the original scale. In some cases, such as the Johnson Su, there isn't an easy way to transform the values back. What I do then is run a little script that passes a value through the original transformation, compares the result with the targeted standard deviation value, and iterates until there is a match. You have then found the value in the original data that, when transformed, gives the targeted transformed value. Remember that when you do this, the distances above and below the mean in your original data will not be the same.
Jim
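The iterative search described above could look roughly like the Python sketch below, which uses bisection and works for any increasing transformation. It is not Jim's actual script: the forward transform assumes the form gamma + delta * asinh((x - theta) / sigma), and the parameters, the transformed mean and SD, and the search bracket are all hypothetical. For this particular form, the back-transformation can also be written with sinh, which the last lines use only as a cross-check.

```python
import math


def to_normal_scale(x, gamma, delta, theta, sigma):
    """Forward transform (assumed Johnson Su form) from original to transformed scale."""
    return gamma + delta * math.asinh((x - theta) / sigma)


def invert_by_bisection(target, transform, lo, hi, tol=1e-10, max_iter=200):
    """Find x in [lo, hi] with transform(x) == target; transform must be increasing."""
    if not (transform(lo) <= target <= transform(hi)):
        raise ValueError("target is not bracketed by [lo, hi]")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if transform(mid) < target:
            lo = mid  # solution lies above mid
        else:
            hi = mid  # solution lies at or below mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Hypothetical fitted parameters and transformed-scale summary statistics.
params = dict(gamma=-1.0, delta=2.0, theta=10.0, sigma=1.5)
t_mean, t_sd = 0.0, 1.0


def forward(x):
    return to_normal_scale(x, **params)


# Original-scale value for each target of mean + k*SD on the transformed scale.
limits = {k: invert_by_bisection(t_mean + k * t_sd, forward, lo=-1e3, hi=1e3)
          for k in (-3, -2, -1, 0, 1, 2, 3)}

# Cross-check against the closed-form inverse for this transform.
closed_form_3 = params["theta"] + params["sigma"] * math.sinh((3.0 - params["gamma"]) / params["delta"])
print(limits[3], closed_form_3)
```

The resulting limits are not equally spaced around the centre value, which matches the closing remark in the reply above.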
natalie_
Level V

Re: Normal Distributions and Transformations

Thanks for your reply, Jim! I will give this a shot.