Hello, I'm very new to statistical analysis and have what's probably a very basic question.
I'm having trouble interpreting the results of a logistic regression analysis I've done with some simple data.
I have a dataset of oak trees in two different forest types and am asking how size and forest type affect fate (undamaged vs. damaged) after a hurricane event.
So, Y = fate, X = size (continuous variable) and forest type (two categories, young forest vs. mature forest).
When I run the regression with both variables included, neither is significant. I also ran it with an interaction term (size * forest type), and again, nothing is significant.
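For concreteness, the two models described above can be written out as follows. The notation is only an assumption for illustration: $p$ is the probability that a tree is damaged, and $F$ is an indicator that I am taking to be 1 for mature forest and 0 for young forest (the actual coding in the software may differ).

$$\operatorname{logit}(p) = \beta_0 + \beta_1\,\text{size} + \beta_2\,F$$

$$\operatorname{logit}(p) = \beta_0 + \beta_1\,\text{size} + \beta_2\,F + \beta_3\,(\text{size} \times F)$$

"Not significant" here means that the tests of $\beta_1$, $\beta_2$ (and $\beta_3$ in the second model) each fail to reject zero.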
This was somewhat surprising, because based on the data I expected both variables to be significant (or close).
So then I ran a logistic regression with one variable at a time, just to see what happened. Size was not significant. Forest type on its own, though, WAS significant, with trees in mature forests being more likely to be damaged.
I'm having trouble wrapping my head around what this means. Size and forest type are correlated (trees are significantly larger in mature forests), and there are ecological reasons why both would strongly influence fate. I don't understand why only forest type is significant, unless there is variation between forest types (other than size) that is very important here, which could very well be the case. But then why isn't forest type still significant when added to the model with size?
Is it that size is influencing fate, just not enough to be statistically significant, and so when it's included it is explaining some of the variation, and this is kind of washing out the signal from forest type?
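One way to probe this idea is a small simulation. The sketch below uses made-up data, not the real oak dataset: the sample size, effect sizes, size distributions, and all names (`fit_logit`, `invert`, `se_alone`, `se_joint`) are assumptions chosen only for illustration. It fits a logistic regression by plain Newton-Raphson using the standard library, once with forest type alone and once with forest type plus a correlated size variable, and compares the standard error on the forest-type coefficient.

```python
import math
import random


def invert(m):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    n = len(m)
    # Augment each row with the matching row of the identity matrix.
    a = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [v / p for v in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [v - f * w for v, w in zip(a[r], a[col])]
    return [row[n:] for row in a]


def fit_logit(rows, y, iters=25):
    """Newton-Raphson logistic fit; returns (coefficients, standard errors)."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for x, yi in zip(rows, y):
            eta = sum(b * v for b, v in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-eta))
            for i in range(k):
                grad[i] += (yi - p) * x[i]
                for j in range(k):
                    info[i][j] += p * (1.0 - p) * x[i] * x[j]
        cov = invert(info)  # inverse Fisher information at the current beta
        beta = [b + sum(cov[i][j] * grad[j] for j in range(k))
                for i, b in enumerate(beta)]
    # cov comes from the last iteration; by then the estimates have settled.
    return beta, [math.sqrt(cov[i][i]) for i in range(k)]


# Hypothetical data: mature-forest trees (forest = 1) are larger on average,
# so size and forest type are strongly correlated by construction.
random.seed(1)
forest, size, fate = [], [], []
for i in range(400):
    f = i % 2
    s = random.gauss(20 + 15 * f, 5) - 27.5  # centered size, arbitrary units
    eta = -0.5 + 0.7 * f + 0.03 * s          # assumed true effects
    forest.append(f)
    size.append(s)
    fate.append(1 if random.random() < 1.0 / (1.0 + math.exp(-eta)) else 0)

# Model 1: intercept + forest type. Model 2: intercept + forest type + size.
b_alone, s_alone = fit_logit([[1.0, f] for f in forest], fate)
b_joint, s_joint = fit_logit([[1.0, f, s] for f, s in zip(forest, size)], fate)

se_alone, se_joint = s_alone[1], s_joint[1]
print("forest only:   coef=%.3f  se=%.3f  z=%.2f"
      % (b_alone[1], se_alone, b_alone[1] / se_alone))
print("forest + size: coef=%.3f  se=%.3f  z=%.2f"
      % (b_joint[1], se_joint, b_joint[1] / se_joint))
```

If the standard error on forest type grows once size is added, even though the simulated forest effect is real, that would be consistent with the idea that the two correlated predictors are competing to explain the same variation. It would not show that this is what happened with the real data, only that correlation between predictors can produce this kind of pattern.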
I obviously don't understand enough about the internal workings of logistic regression models... any advice is appreciated!