Hi Everybody,
I need help understanding and addressing the following issue.
I'm building a model to predict the development of cardiovascular disease in a cohort. I've included two variables that are strongly correlated with each other (R² = 0.83). When I include both in the model, the coefficient of one of them flips sign: it appears protective instead of being a risk factor. But when I include each variable separately, both show a positive association with disease, as expected.
I don't think this has a biological explanation; it looks more like a multicollinearity issue. I know I could use VIF to diagnose it or PCA to address it. But since my outcome is binary, I'm unsure how to calculate VIF correctly. I tried running a least squares regression between the two variables and got a VIF of 1, but I don't think that's the right approach here.
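For context, VIF is defined from the predictors alone: each predictor is regressed on the other predictors and VIF = 1 / (1 - R²) of that auxiliary regression, so the binary outcome does not enter the calculation. Below is a minimal sketch of that calculation, not the analysis from this cohort. The data are simulated, and the names `x1`, `x2`, and `vif` are hypothetical, chosen only for illustration; the noise level is set so the squared correlation is roughly 0.83.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (n_samples x n_predictors).

    Each column is regressed on the remaining columns (plus an intercept)
    by ordinary least squares; VIF_j = 1 / (1 - R_j^2).
    The outcome variable is not used at all.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])  # add intercept
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        r2 = 1.0 - resid.var() / target.var()
        out[j] = 1.0 / (1.0 - r2)
    return out

# Simulated predictors (assumption: illustrative data only).
rng = np.random.default_rng(0)
n = 5000
rho = np.sqrt(0.83)  # correlation whose square is 0.83
x1 = rng.normal(size=n)
x2 = rho * x1 + np.sqrt(1.0 - rho**2) * rng.normal(size=n)

vifs = vif(np.column_stack([x1, x2]))
r2_pair = np.corrcoef(x1, x2)[0, 1] ** 2

print("VIFs:", vifs)
print("1 / (1 - R^2):", 1.0 / (1.0 - r2_pair))
```

With only two predictors, both VIFs equal 1 / (1 - R²) of the pairwise correlation, so an R² of 0.83 would correspond to a VIF near 5.9. A value of exactly 1 suggests the auxiliary regression was run with a single predictor and nothing to regress it against.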
Has anyone dealt with a similar situation? How would you recommend addressing this?
RZ