You didn't state the source of your data. Is it from a designed experiment or from observational data?
A couple of points to keep in mind when developing models to explain the variation in responses:
- Each coefficient in the model is conditional. Its magnitude, and sometimes its sign, depends on the other terms in the model (and on noise), so it can change when terms are added or removed.
- When you remove terms from a model, their sums of squares are pooled into the error term (along with their DFs). This changes the estimate of the mean square error, which is the basis for the F-tests and the resulting p-values.
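To make those two points concrete, here is a minimal sketch using simulated data (the variables `x1`, `x2`, the true coefficients, and the helper `fit_ols` are all made up for illustration, not taken from your study). It fits a full and a reduced model with plain least squares and shows the coefficient for `x1` shifting and the removed term's sum of squares and DF landing in the error term.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit; returns coefficients, error sum of squares, error DF."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sse = float(resid @ resid)
    df_error = X.shape[0] - np.linalg.matrix_rank(X)
    return beta, sse, df_error

# Simulated data (an assumption for illustration): x2 is correlated with x1,
# as is common with observational data.
rng = np.random.default_rng(0)
n = 30
x1 = rng.normal(size=n)
x2 = 0.7 * x1 + rng.normal(scale=0.5, size=n)
y = 2.0 + 1.5 * x1 - 1.0 * x2 + rng.normal(scale=0.3, size=n)

ones = np.ones(n)
X_full = np.column_stack([ones, x1, x2])   # intercept + x1 + x2
X_reduced = np.column_stack([ones, x1])    # x2 removed

beta_full, sse_full, df_full = fit_ols(X_full, y)
beta_red, sse_red, df_red = fit_ols(X_reduced, y)

# Sum of squares attributable to x2 (given the other terms); after removal
# it is pooled into the error sum of squares, along with its 1 DF.
ss_x2 = sse_red - sse_full
mse_full = sse_full / df_full
mse_red = sse_red / df_red

print("x1 coefficient, full model:   ", beta_full[1])
print("x1 coefficient, reduced model:", beta_red[1])
print("error DF, full vs reduced:    ", df_full, df_red)
print("MSE, full vs reduced:         ", mse_full, mse_red)
```

Because the MSE is the denominator of every F-ratio, the p-values of the terms that remain change as well, even though the data did not.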
Simplifying models requires use of multiple sources of information:
1. Practical significance. Graphing the results and understanding how much each factor contributes in practical terms is ALWAYS more important than statistical significance (which YOU control through how your study is designed to provide insight into the random variation). Always ask whether the results make sense from a scientific or engineering perspective.
2. R-square, R-square adjusted and, most importantly, the delta between the two. A large delta suggests the model contains terms that contribute little.
3. p-values
4. other statistics depending on the source of the data (e.g., for observational data you may want to look for multicollinearity)
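As a rough illustration of item 2, the sketch below computes R-square and R-square adjusted for a small model and for the same model padded with pure-noise terms. The data, the five noise columns, and the helper `r_squared_stats` are hypothetical; the formulas are the standard ones, with adjusted R-square defined as 1 - MSE / (SST / (n - 1)).

```python
import numpy as np

def r_squared_stats(X, y):
    """Return (R-square, adjusted R-square); X must include the intercept column."""
    n, k = X.shape                       # k counts the intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sse = float(resid @ resid)
    sst = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - sse / sst
    r2_adj = 1.0 - (sse / (n - k)) / (sst / (n - 1))
    return r2, r2_adj

# Simulated data (an assumption for illustration): one real factor, five noise columns.
rng = np.random.default_rng(1)
n = 20
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=1.0, size=n)
noise_terms = rng.normal(size=(n, 5))

X_small = np.column_stack([np.ones(n), x])
X_big = np.column_stack([X_small, noise_terms])

r2_small, adj_small = r_squared_stats(X_small, y)
r2_big, adj_big = r_squared_stats(X_big, y)
gap_small = r2_small - adj_small
gap_big = r2_big - adj_big

print("small model: R2, adjusted R2, delta:", r2_small, adj_small, gap_small)
print("big model:   R2, adjusted R2, delta:", r2_big, adj_big, gap_big)
```

R-square can only go up when terms are added, so on its own it rewards clutter. The delta is what flags it.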
"All models are wrong, some are useful" G.E.P. Box