Solved: Curve fitting and normalization

bj · Nov 19, 2016 09:07 AM

I have a data set that is positively skewed and therefore need to normalize it. I have a wave-form line of age against challenging behaviours present in individuals with intellectual disabilities. I am totally new to JMP and advanced statistical procedures, and this is a big step in the requirements of my PhD research. Are there any step-by-step guidelines or videos on scale development, curve fitting, and normalizing data? I would greatly appreciate any help.

Peter_Bartell · Nov 21, 2016 11:51 AM

I guess the first question I have is why you feel you must 'normalize' the data just because the distribution of a specific variable is skewed. Skewness by itself is not necessarily a reason for transforming the data in some fashion. Usually the need for transformation is driven more by things like the data analysis methods under consideration or some such external criteria. There are many analytical methods that are either robust to distributional assumptions or to which those assumptions just flat out don't apply. Perhaps you can share a bit more about the practical goals of your study and the methods by which you are considering analyzing the data, and some of us may be able to offer more guidance?

bj · Nov 21, 2016 01:22 PM

Thank you for the response. The reason I want to normalize the data is that I want to be able to generate norms for the scale that I have developed and to establish its psychometric properties. I have been reading this article, https://ia801404.us.archive.org/27/items/ERIC_ED353271/ERIC_ED353271.pdf, which is similar to what I want to do.

I need help to execute this curve fitting in JMP. Thanks again.

Peter_Bartell · Nov 21, 2016 02:05 PM

It appears from the article that the author applied one of the family of Johnson transformations to the variables of interest. Within JMP there are a couple of workflows to do something similar with Johnson transformations.

Workflow 1: Create the usual Distribution platform report using the variable in its original, untransformed units. In the report, click the JMP hot spot above the frequency distribution graphic, choose the Continuous Fit drop-down option, then the desired Johnson transformation. You can save the transformed variable to the data table from the Johnson Transform graphic hot spot by selecting Save Transformed.

Workflow 2 (which I think you can use only if you are running JMP version 12 or higher): From the JMP data table, select the specific column you want to transform, then right-click the column header and choose New Formula Column -> Distributional -> Johnson Normalizing. This workflow creates a new column in the data table corresponding to the Johnson Sb transform.
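For readers who want to see what a Johnson Sb normalizing transform computes, here is a minimal Python sketch of the standard textbook form, z = γ + δ·ln((x − θ) / (θ + σ − x)). It is not JMP's implementation, and the function name `johnson_sb_z` and the parameter values are hypothetical; in practice the parameters would come from the fitted distribution in the JMP report.

```python
import math

def johnson_sb_z(x, gamma, delta, theta, sigma):
    """Map x in (theta, theta + sigma) to a z-score with the Johnson SB transform."""
    if not theta < x < theta + sigma:
        raise ValueError("x must lie strictly inside (theta, theta + sigma)")
    return gamma + delta * math.log((x - theta) / (theta + sigma - x))

# Hypothetical parameters for a 0-27 raw-score scale; not estimates from real data.
params = dict(gamma=1.2, delta=0.8, theta=-0.5, sigma=28.0)

# The transform is monotone increasing in x, so larger raw scores give larger z.
z_values = [johnson_sb_z(x, **params) for x in (0, 5, 13.5, 27)]
print([round(z, 2) for z in z_values])
```

Note that the bounds θ and θ + σ must lie strictly outside the observed range, otherwise the logarithm is undefined at the lowest or highest score.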

bj · Nov 22, 2016 01:45 AM

Thank you again for the guidance. The real problem is this. For example, the subscale has 9 items on a four-point rating (0-3), so the maximum total score is 27. In my frequency distribution, with a sample of 207 and a range of 0-27, some raw score values (15, 18, 19, 20, etc.) are not present. When I run Johnson Sb (JSb), what will be the corresponding z values for the raw score values not present in my data set? How will I be able to generate norms in that case?

Also, I noticed that the scaled scores were widely spread, in the sense that raw scores of 0 had a scaled score of 10 (z converted into a scaled score with a mean of 15 and SD of 3, which is a reference index for challenging behaviours and intellectual disabilities), and the remaining scores (1-27), excluding the scores not present, fell into a scaled score of 17. I also tried T scores, with the same result.

The authors of similar work talk about adjusting means and SDs, smoothing curves across age groups, generating moments and inputting them into Johnson Sb, and then building a conversion table. But I do not see a detailed procedural explanation of that. I do not understand which frequency distribution goes into JSb with the given moments. JSb allows user-defined parameters. Where do these parameters come from, and what are they applied to? I need to understand how JSb is applied. I appreciate any help. Thanks.
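One way to think about the missing raw scores: a fitted Johnson Sb curve is a continuous function, so it can be evaluated at every possible raw score, whether or not that score occurred in the sample. The sketch below builds a conversion table for 0-27 under that assumption. The parameter values are hypothetical placeholders, not estimates from any data set, and the scaled score uses the mean of 15 and SD of 3 mentioned in the post.

```python
import math

MAX_RAW = 27  # 9 items, each rated 0-3

def johnson_sb_z(x, gamma, delta, theta, sigma):
    """Standard Johnson SB transform; requires theta < x < theta + sigma."""
    return gamma + delta * math.log((x - theta) / (theta + sigma - x))

def scaled_score(z, mean=15.0, sd=3.0):
    """Re-express a z-score on a scale with the given mean and SD."""
    return mean + sd * z

# Hypothetical parameters; the bounds (-0.5, 27.5) sit just outside 0-27
# so that the lowest and highest raw scores can still be transformed.
PARAMS = dict(gamma=1.2, delta=0.8, theta=-0.5, sigma=28.0)

# Evaluate the curve at every possible raw score, observed or not.
conversion = {
    raw: round(scaled_score(johnson_sb_z(raw, **PARAMS)))
    for raw in range(MAX_RAW + 1)
}

for raw in (0, 15, 18, 19, 20, 27):
    print(raw, conversion[raw])
```

The table is only as good as the fitted parameters: if nearly all raw scores map to the same scaled score, that usually points to a problem with the fit rather than with the conversion step.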

bj · Nov 28, 2016 10:20 AM

Following up on my previous post: after a couple of weeks of mind-boggling work and extensive research, I have come up with these steps, including a section of my chapter draft. Can somebody please peer review this and confirm whether it is okay? Thanks.

Norms Development

Development of the subscale and Challenging Behavior Composite (CBC) norms was done in several stages. The procedures were adopted from the works of Jing-Jen Wang (1992)[1] and Sparrow, Cicchetti & Balla (2005)[2]. The ages (4-58) were divided into 12 age groups, and the mean, standard deviation, skewness and kurtosis of the raw score distribution were computed for each of the five subscales in each age group. Line graphs of the 12 means against age were drawn separately for each subscale, and a smooth line was traced through the means. In the same manner, the standard deviations across the 12 age groups were plotted and smoothed.
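As an illustration of this stage, the sketch below computes the four descriptive statistics for one age group and smooths a sequence of 12 group means. The smoothing here is a centered 3-point moving average, which is a stand-in for tracing a smooth line by eye and not the method used in the cited works. The statistics use plain population moments, which may differ slightly from the adjusted values a statistics package reports. All numbers are made up.

```python
import statistics

def describe(scores):
    """Mean, SD, skewness and excess kurtosis from population moments."""
    n = len(scores)
    mean = statistics.fmean(scores)
    sd = statistics.pstdev(scores)
    m3 = sum((x - mean) ** 3 for x in scores) / n
    m4 = sum((x - mean) ** 4 for x in scores) / n
    return {
        "mean": mean,
        "sd": sd,
        "skew": m3 / sd ** 3,
        "kurtosis": m4 / sd ** 4 - 3.0,
    }

def moving_average_3(values):
    """Centered 3-point moving average; the two endpoints are left unchanged."""
    smoothed = list(values)
    for i in range(1, len(values) - 1):
        smoothed[i] = (values[i - 1] + values[i] + values[i + 1]) / 3.0
    return smoothed

# Made-up subscale means for 12 age groups, for illustration only.
group_means = [6.1, 7.4, 6.8, 8.9, 8.2, 9.5, 8.8, 7.9, 8.4, 7.0, 7.6, 6.5]
smoothed_means = moving_average_3(group_means)

print(describe([0, 0, 1, 2, 2, 3, 5, 9]))
print([round(m, 2) for m in smoothed_means])
```

The same `moving_average_3` call would be applied to the 12 standard deviations.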

The smoothed means and standard deviations, and the unsmoothed skewness and kurtosis values of each age group for each subscale, were then carried through several stages of standardization. The raw scores were transformed into standard scores, which in turn were re-expressed as another distribution based on the smoothed mean and standard deviation. This distribution (with smoothed mean and standard deviation) was used to fit a Johnson bounded distribution[3], whose values were in turn converted into Challenging Behaviour Rating Scale (CBRS) scores[4] (Mean=15, SD=3). To obtain norms beyond the observed range, a linear regression of CBRS scaled scores on raw scores was performed, and the predicted values were used as CBRS scaled scores.
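Two of the steps in this paragraph can be sketched in a few lines of Python: re-expressing raw scores so that they take on the smoothed mean and standard deviation, and fitting a least-squares line of scaled score on raw score to predict values outside the observed range. The function names and all numbers are hypothetical, and the population SD is an assumption; the Johnson fitting step itself is not shown.

```python
import statistics

def reexpress(raw_scores, smoothed_mean, smoothed_sd):
    """Standardize raw scores, then rescale to the smoothed mean and SD."""
    mean = statistics.fmean(raw_scores)
    sd = statistics.pstdev(raw_scores)  # assumption: population SD
    return [smoothed_mean + smoothed_sd * (x - mean) / sd for x in raw_scores]

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Made-up raw scores for one age group and made-up smoothed targets.
raw = [0, 1, 1, 2, 3, 5, 8, 12]
reexpressed = reexpress(raw, smoothed_mean=4.5, smoothed_sd=3.8)

# Made-up (raw score, scaled score) pairs from an existing conversion table.
observed_raw = [0, 3, 6, 9]
observed_scaled = [10.0, 11.5, 13.0, 14.5]
slope, intercept = fit_line(observed_raw, observed_scaled)

# Predicted scaled score for a raw score outside the observed range.
predicted = intercept + slope * 12
print(round(predicted, 1))
```

A straight line is a strong assumption for a skewed scale, so predictions far from the observed scores deserve a sanity check against the intended score range.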

In the next step, the conversion for the 12 age groups was expanded to include 55 age groups (ages 4-58). Linear interpolation was used to fill the gaps for missing values between adjacent age groups. (I tried extrapolation for missing values beyond the age groups, but the results were not meaningful, so I did not do that.)
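The interpolation step can be sketched as follows, assuming each age group is represented by a single anchor age (for example its midpoint) and that anchor ages are listed in increasing order. The anchor ages and values are invented for illustration. Ages outside the first and last anchors are skipped, matching the decision above not to extrapolate.

```python
def interpolate_by_age(anchor_ages, anchor_values, ages):
    """Linearly interpolate a value for each age that falls between anchors."""
    result = {}
    segments = list(zip(anchor_ages, anchor_values))
    for age in ages:
        if not anchor_ages[0] <= age <= anchor_ages[-1]:
            continue  # no extrapolation beyond the first or last anchor
        for (a0, v0), (a1, v1) in zip(segments, segments[1:]):
            if a0 <= age <= a1:
                result[age] = v0 + (v1 - v0) * (age - a0) / (a1 - a0)
                break
    return result

# Invented anchors: scaled score for one raw score at three group midpoints.
anchor_ages = [5, 10, 20]
anchor_values = [12.0, 14.0, 13.0]

table = interpolate_by_age(anchor_ages, anchor_values, range(4, 22))
for age in (5, 7, 10, 15, 20):
    print(age, round(table[age], 2))
```

In a full conversion table this would be repeated for every raw score, giving one interpolated column per age.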

[1] Wang, Jing-Jen. (1992). An Analytical Approach to Generating Norms for Skewed Normative Distributions. Paper presented at the Annual Meeting of the National Council on Measurement in Education, San Francisco, California.

[2] Sparrow, S.S., Cicchetti, D.V., & Balla, D.A. (2005). Vineland-II Survey Forms Manual. NCS Pearson Inc., United States of America.

[3] Johnson curves are fitted using parameters (γ, δ, θ, σ) derived from the mean, standard deviation, skewness and kurtosis of the distribution.

[4] Vineland-II uses a scale score with mean (15) and standard deviation (3) for its maladaptive behavior scale. It is considered to be a relative measure to the scaled scores of many other tests.

Steven_Moore · Nov 28, 2016 12:44 PM

You need to read Dr. Donald Wheeler's paper:

Transforming the Data Can Be Fatal to Your Analysis

The misguided love affair with the normal distribution must come to an end!

Steve

bj · Nov 29, 2016 03:02 AM

Yes, thank you. I surely will. But I also think I do not have enough knowledge to contest this viewpoint, and I accept that it can be a debate like holism vs. reductionism, quantitative vs. qualitative, type I error vs. type II error, etc. The only defense I can make for my work is that the scale will be more useful at the system level: to classify those who need intensive therapy, those whose parents or teachers can be educated on behaviour modification to avert potential problems, and, nonetheless, those who need positive behavioural support. It is more like system-based provision (a response-to-intervention model) than an individual approach. A child walking into a therapy clinic will still be worked with on a criterion-referenced basis.

Thank you for the suggestion.

Steven_Moore · Nov 29, 2016 03:19 PM

bj,

Based on your latest input, perhaps you need to analyse your data with some predictive modelling technique such as neural networking or partitioning.

Steve

bj · Nov 29, 2016 09:21 PM

Thank you. Will explore that.
