Hi
So I have a very large dataset with 1's and 0's in each column. I am trying to see if there is any correlation between particular columns.
I am trying to compare responses from people inside an area with responses from people outside that area, to see if there are any significant differences between the two groups. I am doing this for many different areas. It is a survey with yes and no answers, so the columns would look something like this, with "no" as 0 and "yes" as 1.
Inside area    Outside area
0              0
0              0
0              1
0              1
1              1
1              1
1              1
etc.           etc.
I am not sure how to go about this. I have tried the "Fit Y by X" option with column 1 as Y, Response and column 2 as X, Factor, but the results don't really look right.
Can anyone help with this?
Thanks
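For anyone reading along, here is one way to frame the comparison outside of JMP: treat the two columns as two independent groups, count the 0's and 1's in each, and test the resulting 2x2 table. This is only a minimal sketch in plain Python. The function names are made up for illustration, the seven rows are just the example rows shown above (not the real survey), and the Pearson chi-square without continuity correction is one reasonable choice among several.

```python
import math

def two_by_two(inside, outside):
    """Count no (0) and yes (1) answers per group; groups may differ in length."""
    return [
        [inside.count(0), inside.count(1)],
        [outside.count(0), outside.count(1)],
    ]

def chi_square_2x2(table):
    """Pearson chi-square for a 2x2 table (1 degree of freedom, no correction).

    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Example rows from the post, used purely as an illustration.
inside = [0, 0, 0, 0, 1, 1, 1]
outside = [0, 0, 1, 1, 1, 1, 1]

table = two_by_two(inside, outside)
chi2, p_value = chi_square_2x2(table)
print(table, chi2, p_value)
```

With only seven rows per group the expected cell counts are very small, so the chi-square approximation is not reliable here and an exact test would be the safer choice. The sketch is only meant to show how the table is set up.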
Yes, I tried to use the method on a couple of different areas and it looks good!
I was looking at the number of 1's and there are 44 in the control group and 66 in the test group, so I was just expecting there to be a larger difference than 22 in order for it to be significant. But as you mention, one also needs to consider that there is a difference in total sample size between the two groups, so I can see that now. The p-value of 0.0001 just threw me off a bit, as I was not expecting it to be that low.
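To illustrate the point about sample size, here is a rough sketch in plain Python. The counts of 44 and 66 come from my data, but the group totals below are invented purely for illustration (they are not my real sample sizes), so the output does not reproduce the 0.0001 I got in JMP. The function name is also just a placeholder.

```python
import math

def chi_square_from_counts(yes_a, n_a, yes_b, n_b):
    """Pearson chi-square (1 df, no correction) from yes-counts and group sizes."""
    table = [[n_a - yes_a, yes_a], [n_b - yes_b, yes_b]]
    row_totals = [n_a, n_b]
    col_totals = [table[0][0] + table[1][0], table[0][1] + table[1][1]]
    n = n_a + n_b
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    return chi2, math.erfc(math.sqrt(chi2 / 2))

# HYPOTHETICAL group sizes: same yes-counts (44 vs. 66), different totals.
chi2_equal, p_equal = chi_square_from_counts(44, 200, 66, 200)      # 22% vs. 33%
chi2_unequal, p_unequal = chi_square_from_counts(44, 400, 66, 200)  # 11% vs. 33%

print(chi2_equal, p_equal)
print(chi2_unequal, p_unequal)
```

The same difference of 22 in raw counts gives a much smaller p-value when the control group is larger, because the test compares the proportions of 1's rather than the raw counts.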
But thank you so much, Dan! You have been extremely helpful and I sincerely appreciate it!