We use objective methods (e.g., design of experiments, analysis of variance, linear regression) that involve subjective decisions. What level of alpha should I use for my inference? Do I consider the importance of a factor as well as its statistical significance? Also, there are many statistics and criteria available besides the F ratio that tests the significance of the whole model. Which ones should I use? How do they work and how do they inform me?
So a p-value of 0.0657 for an F ratio of 4.1521 with only 5 degrees of freedom (DF) in the denominator is not an especially strong hypothesis test. 9 DF were given up to fit the whole model. If I eliminate unimportant and insignificant terms (ht*temp, ht*gl ratio, ht*ht, and gl*temp), the F ratio is 11.2338 with 9 DF in the denominator, almost twice as many, and the p-value is now 0.0012. The R square and adjusted R square are more in line with each other after reducing the model. The first model was over-fitting and that, among other things, weakened the test.
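For anyone who wants to see how the DF trade-off feeds into the p-value, here is a minimal sketch in plain Python that recomputes the upper-tail probability of the F distribution for both models. The numerator DF are my reading of the numbers above, not something the report states directly: 9 for the full model (the DF given up to fit it) and 5 for the reduced model (9 minus the 4 removed terms). The function name `f_test_p_value` is made up for this illustration, and the numerical integration is only a convenience that assumes both DF are at least 2; a statistics package would normally do this for you.

```python
import math


def f_test_p_value(f_ratio, df_num, df_den, steps=2000):
    """Upper-tail probability P(F > f_ratio) for an F(df_num, df_den) distribution.

    Uses the identity P(F > f) = I_x(df_den / 2, df_num / 2) with
    x = df_den / (df_den + df_num * f), where I_x is the regularized
    incomplete beta function, evaluated here with Simpson's rule.
    Illustrative helper: assumes df_num >= 2 and df_den >= 2 so that the
    integrand stays bounded at zero, and that `steps` is even.
    """
    if df_num < 2 or df_den < 2:
        raise ValueError("this sketch assumes both DF are at least 2")
    if steps % 2:
        raise ValueError("steps must be even for Simpson's rule")
    if f_ratio <= 0:
        return 1.0

    a, b = df_den / 2.0, df_num / 2.0
    x = df_den / (df_den + df_num * f_ratio)

    def integrand(t):
        return t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)

    # Simpson's rule on [0, x]
    h = x / steps
    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    integral = total * h / 3.0

    # Normalize by the complete beta function B(a, b)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return integral / math.exp(log_beta)


# Full model: F = 4.1521, assumed 9 numerator DF, 5 denominator DF
p_full = f_test_p_value(4.1521, 9, 5)
# Reduced model: F = 11.2338, assumed 5 numerator DF, 9 denominator DF
p_reduced = f_test_p_value(11.2338, 5, 9)

print(f"full model    p = {p_full:.4f}")
print(f"reduced model p = {p_reduced:.4f}")
```

If the DF assumptions hold, the two results should land close to the 0.0657 and 0.0012 quoted above. Moving four DF from the numerator to the denominator both raises the F ratio and tightens the reference distribution it is compared against.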
It is the age of 'big data', but we must still apply the methods developed for small data sets, such as this example, with careful consideration. There is no Easy button to be had.