Correlation Analysis



Jun 23, 2011

In the past, I have used JMP to compute the correlation coefficient between two variables. I now have a new data set with a logical one-to-many relationship: a record can have as many as 20 unique values for Column z. Rather than stringing all the values together in Column z, it was decided that a new record would be created for each separate value loaded into Column z. So, if a record must have the values A and B, two records are created, one with Column z set to A and one with Column z set to B.

Does this have any bearing on the computed correlation coefficient, since there are now additional rows in the data set? Would it be better to create a separate column that flags the existence of a specific value? For example, if I am tracking occurrences of "A", I would create a new column called "A exists" and populate the field with Y or N.
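
To make the question concrete, here is a minimal Python sketch (not JMP) of the two layouts. Everything in it is hypothetical: the four records, the numeric columns `x` and `y`, and the helper `pearson` are made up for illustration. It computes the correlation of `x` and `y` once with one row per record and once after each record is repeated for every Column z value, then builds a 0/1 version of the "A exists" column.

```python
from math import sqrt

def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical data: (record id, x, y, list of Column z values)
records = [
    (1, 1.0, 2.0, ["A", "B"]),
    (2, 2.0, 1.0, ["A"]),
    (3, 3.0, 4.0, ["B", "C", "D"]),
    (4, 4.0, 3.0, ["C"]),
]

# Layout 1: one row per record
r_wide = pearson([r[1] for r in records], [r[2] for r in records])

# Layout 2: one row per (record, z value), so x and y are repeated
long_rows = [(rid, x, y, z) for rid, x, y, zs in records for z in zs]
r_long = pearson([r[1] for r in long_rows], [r[2] for r in long_rows])

# Alternative: keep one row per record and add an "A exists" flag (1 = Y, 0 = N)
a_exists = [1 if "A" in zs else 0 for _, _, _, zs in records]
r_flag = pearson(a_exists, [r[2] for r in records])

print("one row per record:", r_wide)
print("one row per z value:", r_long)
print("'A exists' flag vs y:", r_flag)
```

In this made-up example the two layouts give different coefficients, because records with more Column z values are counted more times in the long layout. Whether that weighting is wanted depends on the analysis, so this is only an illustration of the effect, not a recommendation.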