Correlations with binary data

Dec 1, 2016 11:26 AM

Hi everyone,

I have a large data table, and each column has a range check property. If a value is within the range, it is kept; if it is outside the range, it is removed and becomes missing. I then copy the data table and replace every missing value with a 0 and every non-missing value with a 1. So basically, I have a large data table full of 1s and 0s (indicating pass or fail).
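The recoding step described above can be sketched in Python as follows. This is only an illustration, not JMP or JSL code: the table is a hypothetical dictionary of columns, the column names are made up, and `None` stands in for a value that the range check turned into missing.

```python
# Hypothetical data table: column name -> values.
# None marks a missing value (one that failed its range check).
table = {
    "width":  [4.1, None, 3.9, 4.0],
    "height": [None, None, 7.2, 7.1],
}

def to_pass_fail(columns):
    """Return a copy where non-missing values become 1 and missing values become 0."""
    return {
        name: [0 if value is None else 1 for value in values]
        for name, values in columns.items()
    }

indicators = to_pass_fail(table)
print(indicators)
```

The original table is left untouched, which mirrors working on a copy of the data table.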

I tried the correlation table and I have a few questions.

1) If two columns are entirely filled with 1s, why is the correlation 0? From the equation for Pearson Product Moment Correlation, I would be dividing by 0.

2) If a value in the correlation table is very close to 1 (or -1), does that mean that if a value in one column has passed, it is likely that it passed in the other column?
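For reference, the sample Pearson correlation mentioned in question 1 is usually written as below for two columns $x$ and $y$ with $n$ rows. The symbols are standard notation and are not taken from the post.

```latex
r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
              {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

If a column contains only 1s, then $x_i = \bar{x} = 1$ for every row, so both the numerator and the denominator are 0 and $r_{xy}$ takes the undefined form $0/0$.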

1 ACCEPTED SOLUTION

Natalie,

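I assume that the correlation just can't be calculated correctly for 2 columns with all 1's, since there is no variance: the denominator of the Pearson formula is zero, so the correlation is undefined rather than truly 0.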

Concerning your second question, a value close to +1 would indicate that a 1 in one of the columns would predict that a 1 would be in the other column. If you square the Pearson r, you will get the proportion of variance in one column that is predictable from the other. If you have a value close to -1, you would predict a zero if there is a one in the other column.

Jim
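As a rough check of the points in this answer, here is a minimal Python sketch that computes the Pearson correlation by hand for 0/1 columns. It is not how JMP computes its correlation table; the function name `pearson` and the example columns are hypothetical, and returning `None` for a zero-variance column is a choice made for this sketch.

```python
import math

def pearson(x, y):
    """Sample Pearson correlation; returns None when either column has zero variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # undefined (0/0), which is not the same as a correlation of 0
    return sxy / math.sqrt(sxx * syy)

# Hypothetical pass/fail columns (1 = pass, 0 = fail).
col_a = [1, 1, 0, 1, 0, 1]
col_b = [1, 1, 0, 1, 0, 1]     # identical to col_a
col_c = [0, 0, 1, 0, 1, 0]     # exact opposite of col_a
all_pass = [1, 1, 1, 1, 1, 1]  # no variance at all

r_same = pearson(col_a, col_b)
r_opposite = pearson(col_a, col_c)
r_constant = pearson(col_a, all_pass)

print(r_same, r_same ** 2)   # r and the proportion of variance explained
print(r_opposite)
print(r_constant)
```

Identical columns give r of 1 (up to floating-point rounding), exactly opposite columns give -1, and a column that is all 1s gives `None` here because its variance is zero.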

2 REPLIES

Re: Correlations with binary data

Thanks Jim! I did some more research on correlations and came to this conclusion.