tranquilo123
Level I

standardization vs normalization

Hi! 

 

I do not really get it. Is there a difference between normalization and standardization of values in JMP? 

I transformed my values in JMP, Distributions --> Standardized, is this the same as normalization? 

 

If not, how do you conduct normalization for the predictors? 

 

All the best. 

2 ACCEPTED SOLUTIONS

KarenC
Super User (Alumni)

Re: standardization vs normalization

Normalization - that is a big topic. It can range from simple to very complex. Normalization is often used to align similar but different data. For example, data from different batches, different runs, or different machines, all measuring the same thing, are often normalized to remove known sources of variation. One batch may run high and another low, and a common sample, run in both batches, may be used to adjust the batches so that they can be compared. So while standardization is straightforward to explain and implement, normalization is not. I am interpreting the word normalization in the context of data alignment; it has nothing to do with the normal distribution.
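
As a rough illustration of the batch adjustment described above, here is a minimal sketch in plain Python (not JMP or JSL). The batch values, the sample names, the helper `align_to_reference`, and the choice of a simple additive offset are all assumptions made for the example; real batch normalization schemes can be considerably more involved.

```python
# Hypothetical measurements: two batches measuring the same quantity,
# each containing one shared reference sample named "common".
batch_a = {"common": 10.0, "s1": 12.0, "s2": 9.0}
batch_b = {"common": 12.5, "s3": 15.0, "s4": 11.0}


def align_to_reference(batch, reference_key, target):
    """Shift every value in the batch so the reference sample equals `target`."""
    offset = target - batch[reference_key]
    return {name: value + offset for name, value in batch.items()}


# Assumption: use the mean of the two reference readings as the common target.
target = (batch_a["common"] + batch_b["common"]) / 2

aligned_a = align_to_reference(batch_a, "common", target)
aligned_b = align_to_reference(batch_b, "common", target)

print(aligned_a)
print(aligned_b)
```

After the shift, the common sample reads the same in both batches, so the remaining samples can be compared directly. A multiplicative (ratio) adjustment would be an equally plausible choice, depending on how the batch effect behaves.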

Peter_Bartell
Level VIII

Re: standardization vs normalization

I'm inclined to agree with @KarenC's explanation. To me, normalization is a process by which one attempts to express two or more data sets on a common footing. There can be any number of algorithms for accomplishing this task. For example, in time series analysis, a common technique is to create an index series where each value is the original value divided by the value at time 1. This is often done to take two time series whose original values are on very different scales and compare them on a common scale in a time series plot, where % changes, turning points, etc. are more important than the absolute values of the disparate series.
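
The index series idea can be sketched in a few lines of plain Python (again, not JMP or JSL). The two series and the helper name `index_series` are made up for illustration.

```python
def index_series(values):
    """Divide every value by the first value, so the series starts at 1.0."""
    base = values[0]
    if base == 0:
        raise ValueError("the first value must be non-zero to build an index")
    return [v / base for v in values]


# Hypothetical series on very different scales.
series_a = [200.0, 210.0, 190.0, 220.0]
series_b = [5.0, 5.5, 4.5, 6.0]

index_a = index_series(series_a)
index_b = index_series(series_b)

print(index_a)
print(index_b)
```

Both indexed series start at 1.0, so a value of 1.1 means "10% above the starting level" regardless of the original units.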

 

Standardization is one specific form of normalization: (each observation - mean of the series) / (standard deviation of the series). This is often a very desirable thing to do when modeling data via ordinary least squares regression and a multitude of other EDA and modeling methods. Indeed, JMP often does this behind the scenes in many platforms (or at least offers the option).

4 REPLIES
txnelson
Super User

Re: standardization vs normalization

Standardization is the conversion of the data to a mean of 0 and a standard deviation of 1.  It does not change the shape of the data.  
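
For readers who want to see the arithmetic, here is a minimal sketch in plain Python (not JMP or JSL). The data values and the function name `standardize` are made up, and the sample standard deviation (n - 1 in the denominator) is an assumption of this example.

```python
import statistics


def standardize(values):
    """Return z-scores: (value - mean) / sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    if sd == 0:
        raise ValueError("cannot standardize a constant series")
    return [(v - mean) / sd for v in values]


# Hypothetical data with mean 5 and sample standard deviation 3.
data = [2.0, 4.0, 4.0, 5.0, 10.0]
z = standardize(data)

print(z)
print(statistics.mean(z), statistics.stdev(z))
```

The z-scores have a mean of 0 and a standard deviation of 1 (up to floating-point rounding). Because the transformation only shifts and rescales, the ordering and relative spacing of the points stay the same, which is why the shape of the distribution is unchanged.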

To transform the data to a normal distribution, JMP's Distribution platform has the Continuous Fit option under the red triangle.  If you know what the current distribution is, you can select it under that option.  You can also select "All", and JMP will evaluate the current data against each of the distributions and determine which one fits best.  From there, Save Transformed options are available for many of the distributions.

In addition to this, you may have your own preferred method of transforming your data, such as a log or a squared transformation.  These can be done by creating a new column and specifying the transformation as the formula for the column.

 

Good documentation on the "Options for Continuous Variables" can be found in the Basic Analysis JMP guide

     Help==>Books==>Basic Analysis 

Jim
dale_lehman
Level VII

Re: standardization vs normalization

I believe the question is about standardization vs. normalization, not transformation.  I am hoping someone can give a more definitive answer, but I think transformation is a more generic term covering many types of changes to the data, including both standardization and normalization.  I also think Jim's definition of standardization is correct - it is a term that describes converting the data to have a mean of 0 and a standard deviation of 1.  I think normalization would include standardization as an option, but might include other ways of converting the data to some kind of uniform scale - for example, calculating how far (on a 0-1 scale) from the minimum value each observation is (i.e., (x-xmin)/(xmax-xmin)).
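
The 0-1 scaling formula above, (x-xmin)/(xmax-xmin), can be sketched in plain Python (not JMP or JSL). The data values and the function name `min_max_scale` are made up for illustration.

```python
def min_max_scale(values):
    """Rescale values to the 0-1 range: (x - xmin) / (xmax - xmin)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot rescale a constant series")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical data.
data = [2.0, 4.0, 4.0, 5.0, 10.0]
scaled = min_max_scale(data)

print(scaled)
```

The minimum maps to 0 and the maximum to 1. Unlike standardization, the result depends entirely on the two extreme values, so a single outlier can compress the rest of the data into a narrow band.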
