My ANOVA analysis indicates that there is no statistical difference between groups. However, when I plot the data as a simple chart, there is obviously a downward trend in the data. Which do I "believe"?
You are making a critical error when you say "My ANOVA analysis indicates that there is no statistical difference between groups." Don't feel bad. You are not alone. There is a tempest swirling around significance testing in science now, specifically regarding p-value thresholds.
The correct statement is along the lines of "My ANOVA analysis barely fails to reject the null hypothesis." It does not say "There is no difference." There clearly IS a difference in the sample; you can see it. What the test says is that these data do not provide enough evidence to call that difference statistically significant, although it is very close.
You should design the test to be powerful enough to statistically detect the difference you care about on technical grounds. If you care about the difference you see with your eyes in these data, believe that. And next time, decide a priori how large a balanced test must be to provide adequate power to detect that difference.
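To make "adequate power" concrete, here is a rough sketch of the usual normal-approximation sample-size calculation for comparing two group means. The function name and the numbers (a difference of 0.5 with a standard deviation of 1.0) are made up for illustration, not taken from the data in this thread. A multi-group ANOVA needs a proper power calculation in your statistics package, so treat this only as a ballpark.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate per-group n for a two-sided, two-group comparison of means.

    Uses the normal approximation n = 2 * ((z_alpha + z_power) * sigma / delta)^2.
    delta: smallest difference worth detecting (hypothetical input).
    sigma: within-group standard deviation (hypothetical input).
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_power = z.inv_cdf(power)          # quantile for the desired power
    return math.ceil(2 * ((z_alpha + z_power) * sigma / delta) ** 2)


# Illustrative values only: detect a shift of half a standard deviation.
n = n_per_group(delta=0.5, sigma=1.0)
print(n)
```

Halving the difference you want to detect roughly quadruples the required sample size, which is why the target difference should be settled before the data are collected.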
Look at your connecting letters report and your LSD threshold matrix further down the page from what you are showing above. From what I can see, you have at least 4 groups that are significantly different from 0391-0420, based on the overlap, or lack thereof, of the mean diamonds at the upper and lower overlap marks. The R-square also indicates there is some difference among the groups. If there were little difference, the R-square would be closer to zero.
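For what it's worth, the R-square in a one-way ANOVA report is the between-group sum of squares divided by the total sum of squares, and it is tied to the F ratio by an exact identity. A minimal sketch with made-up numbers (not the data in this thread; the function name is my own):

```python
def one_way_anova(groups):
    """Return (F ratio, R-square) for a one-way layout given as a list of lists."""
    all_values = [y for g in groups for y in g]
    n_total = len(all_values)
    grand_mean = sum(all_values) / n_total

    # Between-group variation: group means around the grand mean, weighted by n.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Total variation: every observation around the grand mean.
    ss_total = sum((y - grand_mean) ** 2 for y in all_values)
    ss_within = ss_total - ss_between

    df_between = len(groups) - 1
    df_within = n_total - len(groups)

    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    r_square = ss_between / ss_total
    return f_ratio, r_square


# Made-up example: three groups of three observations.
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_ratio, r_square = one_way_anova(groups)
print(f_ratio, r_square)  # 3.0 0.5

# The same F ratio recovered from R-square alone.
f_from_r2 = (r_square / 2) / ((1 - r_square) / 6)
```

In this toy case half the variation lies between groups, yet the F ratio of 3.0 falls short of the usual 5% critical value for 2 and 6 degrees of freedom (about 5.14). With small groups, a sizable R-square and a non-significant F can coexist.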
If there really is a downward trend, a regression model should be more effective for detecting it: Y versus a continuous variable such as individual age or group mean age.
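As a rough illustration of why regression can be sharper here: a straight-line fit spends a single degree of freedom on the trend, whereas ANOVA spreads the same signal over one degree of freedom per group beyond the first. The sketch below fits an ordinary least-squares line and returns the t ratio for the slope. The `age` and `loss` values are invented for the example and the function name is my own.

```python
import math


def slope_test(x, y):
    """Least-squares line of y on x; returns slope, intercept, SE(slope), t ratio."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual variance on n - 2 degrees of freedom.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(sse / (n - 2) / sxx)
    return slope, intercept, se_slope, slope / se_slope


# Invented data: a continuous age variable and the measured response.
age = [1, 2, 3, 4, 5, 6]
loss = [10.0, 9.0, 9.5, 8.0, 7.5, 7.0]

slope, intercept, se_slope, t_ratio = slope_test(age, loss)
print(round(slope, 3), round(se_slope, 3), round(t_ratio, 2))
```

The t ratio is compared against a t distribution with n - 2 degrees of freedom. The slope itself is the estimated change in Y per unit of age.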
Have you placed a fitted line through the data with the 95% confidence band? If the line can be rotated within the band into a horizontal position, then the "trend" is not significant. This is often the case. So-called "trend lines" are often abused. You might also look at the "connecting letters report", which is very useful.
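The band check can be put in numbers. The sketch below computes the pointwise 95% confidence band for the fitted mean, plus the 95% confidence interval for the slope, which is the more direct version of the same question: if that interval contains zero, a flat line is compatible with the data. The data are invented, the function name is my own, and the critical value 2.776 is the tabled t value for 4 degrees of freedom (n - 2 with six points); look up the right value for your own n.

```python
import math


def fit_with_band(x, y, t_crit):
    """Fit y on x; return the slope CI and a function giving the band at x0."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))  # residual standard deviation
    se_slope = s / math.sqrt(sxx)
    slope_ci = (slope - t_crit * se_slope, slope + t_crit * se_slope)

    def band(x0):
        """Pointwise confidence limits for the mean response at x0."""
        fit = intercept + slope * x0
        half = t_crit * s * math.sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)
        return fit - half, fit + half

    return slope_ci, band


# Invented data, six points, so n - 2 = 4 degrees of freedom.
x = [1, 2, 3, 4, 5, 6]
y = [10.0, 9.0, 9.5, 8.0, 7.5, 7.0]

slope_ci, band = fit_with_band(x, y, t_crit=2.776)
trend_is_significant = not (slope_ci[0] <= 0 <= slope_ci[1])
print(slope_ci, trend_is_significant)
print(band(1), band(3.5), band(6))  # band is narrowest at the mean of x
```

The band is narrowest at the mean of x and flares toward the ends, which is why a trend that looks obvious to the eye can still leave room for a horizontal line.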
Perhaps I'm incorrect, but as a consulting statistician and a card-carrying member of the American Statistical Association, I feel I must strongly object to the direction I sense this thread may be taking. Before we all start tripping through the Garden of Forking Paths, perhaps we should just take a breath, stop recommending different procedures, and understand some particulars about the data and the science behind it.
Where do the data come from? What is "Y"? Are they happenstance data, or the results of a designed experiment? How was "Y" measured? Does the measurement system discriminate acceptably? How were the groups chosen? From where does the imbalance in the group sample sizes arise? There appear to be outliers in the data; are they correct and understood? What is a technically significant difference? What kind of decision are you trying to make from the results of this analysis? Is the objective "data mining" or inference? Was an ANOVA the planned analysis from the beginning?
I hope no one takes offense. None is intended. I think we all want to try to help. But the answers to the questions (and more!) can mean the difference between help and harm.
To answer your questions for a better understanding of your thought process:
· Data comes from actual measurements, completed on scales following removal of the equipment from the process after failure
· “Y” is metal loss in the process
· Discrimination is acceptable
· The group sizes were selected arbitrarily
· Imbalance in the group sizes comes from natural failure of the equipment, requiring replacement
· The outliers are correct, but not clearly understood
· Due to the high cost of the metal, differences of 0.005 or more are significant
· I’m trying to determine a rate of metal loss for budgeting purposes
· ANOVA was not the planned analysis, but was used to make some inferences about the process
Any additional help would be appreciated.