Sep 22, 2015 4:20 PM
(10690 views)

An assignment requires that we standardize the data for a particular variable (runtime) and see what a current value would become after standardizing the data. Any ideas on how to do this? Thanks!

Sep 23, 2015 5:21 AM
(16838 views)
| Posted in reply to message from bill_worley 09/23/2015 08:10 AM

And you can create a virtual column this way, without the need to build a formula directly:

Sep 23, 2015 3:04 AM
(9584 views)
| Posted in reply to message from rebecca_drebin0 09/22/2015 07:20 PM

Hi Rebecca,

The simplest way is to run a Distribution on the variable and, from the red triangle menu, choose Save > Standardized.

This creates a new column in the data table with the standardized value of the variable for each row.

Otherwise, you can create a new column and enter the Standardized formula manually.

In some cases (such as the Fit Model platform) you do not need to standardize the data before the analysis, since you can request the standardized coefficients in the results. To do this, right-click the Parameter Estimates table and choose Std Beta under the Columns option.

Good luck!
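To illustrate what a standardized coefficient is, here is a minimal Python sketch (outside JMP, standard library only). It assumes Std Beta corresponds to the usual definition, i.e. the slope you would get after standardizing both variables, and the data values are made up for the example.

```python
from statistics import mean, stdev

# Hypothetical data, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

mx, my = mean(x), mean(y)
sx, sy = stdev(x), stdev(y)  # sample standard deviations (n - 1 denominator)

# Ordinary least-squares slope on the raw data.
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
sxx = sum((a - mx) ** 2 for a in x)
slope = sxy / sxx

# Standardized coefficient: the raw slope rescaled by sd(x) / sd(y).
std_beta = slope * sx / sy

# The same number, obtained by standardizing both variables first
# and fitting a slope through the standardized values.
zx = [(a - mx) / sx for a in x]
zy = [(b - my) / sy for b in y]
slope_z = sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)

print(std_beta, slope_z)
```

Both routes give the same value, which is why standardizing beforehand is not needed when the platform can report the standardized estimate.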

Hi Rebecca

To add on to Ron's reply, standardizing data can also be described as centering and scaling the data. Centering is where you subtract the mean from all values, and scaling is dividing the centered data by the standard deviation. You can build the formula for this by first doing Analyze > Distribution and getting the mean and standard deviation values for your data. You can then make a new column and create a column formula:

(Runtime value - mean value) / Stdev value.
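As a quick check of the arithmetic outside JMP, here is a small Python sketch of the two steps. The runtime values are made up, and it assumes the sample standard deviation (n - 1 denominator) is the one being used. It also shows why the subtraction has to happen before the division.

```python
from statistics import mean, stdev

# Hypothetical runtime values, made up for illustration only.
runtime = [10.0, 12.0, 14.0, 16.0, 18.0]

m = mean(runtime)
s = stdev(runtime)  # sample standard deviation (n - 1 denominator)

# Step 1 (centering): subtract the mean from every value.
centered = [x - m for x in runtime]

# Step 2 (scaling): divide the centered values by the standard deviation.
standardized = [c / s for c in centered]

# Pitfall: without parentheses, only the mean gets divided by s.
wrong = [x - m / s for x in runtime]

print(standardized)
print(mean(standardized), stdev(standardized))
```

After standardizing, the column has mean 0 and standard deviation 1 (up to rounding), which is an easy way to confirm the formula was entered correctly.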

Best,

Bill

Sep 25, 2015 7:14 AM
(9584 views)
| Posted in reply to message from rebecca_drebin0 09/22/2015 07:20 PM

In addition to the solutions already offered, it sounds like you might want to use historical data to determine how to standardize new values. If so, compute and store the mean and standard deviation of the historical sample, then standardize the new value with this computation: (new value - mean) / (standard deviation). The computation can be performed in a column formula or with a script, depending on the situation.
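Here is a minimal sketch of that idea in Python rather than in a JMP column formula or script. The historical values, the new value, and the function name `standardize` are all made up for the example, and the sample standard deviation (n - 1 denominator) is an assumption.

```python
from statistics import mean, stdev

# Hypothetical historical runtimes, made up for illustration only.
historical = [10.0, 12.0, 14.0, 16.0, 18.0]

# Compute once and store; these stay fixed when new values arrive.
hist_mean = mean(historical)
hist_sd = stdev(historical)  # sample standard deviation (n - 1 denominator)


def standardize(new_value, mean_value, sd_value):
    """Standardize one value against stored historical parameters."""
    return (new_value - mean_value) / sd_value


# A new observation is standardized against the stored parameters,
# not against a mean and standard deviation that include it.
z_new = standardize(20.0, hist_mean, hist_sd)
print(z_new)
```

The point of storing the two numbers is that a new value does not change the reference it is measured against.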

