I have a quick question. JMP 13 Pro comes with a Generalized Regression personality. After fitting the data, it is possible to check the residuals using the Diagnostic Bundle. Are the residuals expected to follow the ANOVA assumptions (i.e., normality, homogeneity of variance, iid) if I use the Gamma distribution?
I think you may be intermixing assumptions we would like to make about residuals and their desired properties with what they actually are computationally. The residuals out of any JMP modeling platform are nothing more elegant than the difference between the actual observed response and whatever the predicted value is for that specific observation, based on the model that was fit. So residuals are what they are.
So there is really a two-step approach here...
Step 1: Compute the residuals.
Step 2: Analyze residuals for 'lack of fit' or any other suspicious pattern that may call into question the continued use of the model for practical purposes.
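To make the two steps concrete, here is a minimal sketch in Python rather than JMP. The numbers are made up, not taken from anyone's data table, and it assumes a full factorial model (both main effects plus their interaction) fit by least squares, in which case the predicted value for each observation is simply its cell mean.

```python
from statistics import mean

# Hypothetical responses for a 2x2 factorial with 4 replicates per cell
# (made-up numbers for illustration only).
data = {
    ("A", "1"): [12, 15, 9, 14],
    ("A", "2"): [30, 41, 25, 36],
    ("B", "1"): [5, 7, 4, 8],
    ("B", "2"): [18, 22, 15, 25],
}

# Step 1: residual = observed - predicted, where predicted is the cell mean
# (assumes a least-squares fit of main effects plus interaction).
residuals = {}
for cell, ys in data.items():
    predicted = mean(ys)
    residuals[cell] = [y - predicted for y in ys]

# Step 2: look for suspicious patterns, e.g. a residual range that grows
# with the predicted value.
for cell, res in residuals.items():
    print(cell, "predicted:", mean(data[cell]), "residual range:", max(res) - min(res))
```

The printout is only a stand-in for the plots; the point is that the residuals themselves carry no assumptions, and the judgment comes in step 2.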
We use techniques like residual analysis (many of the visualizations are embedded in the Diagnostic Bundle) to check the assumptions we would like to hold. So there aren't any embedded assumptions per se in the bundle's visualizations. The data is what the data is. You can use your eye and domain expertise to check for normality, non-random patterns, outliers, etc. Your eyes and domain expertise should trump any F-test or other 'lack of fit' test on the planet.
Adding to Peter's explanation, you can think of each residual (Y observed - Y predicted) as an estimate of the response error. The Generalized Regression model that you used modeled the errors with the gamma distribution. So you assume the gamma distribution, not the normal distribution, when checking the regression assumptions with the residuals.
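One way to see what this means in practice: under a gamma model the standard deviation is proportional to the mean, so raw residuals are expected to fan out as the predicted value grows. A common check is to divide each raw residual by its predicted value (a Pearson-type residual with the dispersion factor left out) and look for roughly constant spread. The sketch below is in Python with made-up numbers, and it is an assumption on my part that this scaling is a useful companion to the plots in the Diagnostic Bundle; I am not claiming it is what the bundle computes.

```python
from statistics import mean

# Hypothetical responses for two treatment cells (made-up numbers).
cells = {
    "low": [8, 10, 12, 10],
    "high": [80, 100, 120, 100],
}

scaled = {}
for label, ys in cells.items():
    mu = mean(ys)  # predicted value, assuming the fitted mean equals the cell mean
    raw = [y - mu for y in ys]
    # The gamma variance function is V(mu) = mu^2, so the Pearson-type
    # residual divides the raw residual by mu (dispersion factor omitted).
    scaled[label] = [r / mu for r in raw]
    print(label, "raw:", raw, "scaled:", scaled[label])
```

In this toy example the raw residuals differ by a factor of ten between the two cells, while the scaled residuals line up, which is the pattern a gamma model would lead you to expect.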
Thank you both very much for your responses. I am attaching a JMP file with the Generalized Regression (JSL) script. Which characteristics of the residuals should I check against the gamma distribution?
Before we get too much further, rather than discuss the residuals, perhaps you can share with us the practical problem you are trying to address? In your file you have a nice balanced, replicated designed experiment with two nominal factors at two levels each, with 4 replicates per treatment combination. One rarely creates a design such as this without some forethought and rationale. Before I start examining 'statistics' and visualizations to evaluate a study that someone else created, I'd like to understand the goal of the study.
I am trying to determine whether the treatments, fungicide type (X1) and fungus batch (X2), affect the average number of spores (Y). The experimental design is a 2x2 factorial with 4 replicates per treatment combination. The residuals do not meet the ANOVA assumptions. In particular, I am concerned that they do not appear normally distributed and have unequal variances, which would invalidate the conclusions about the statistical significance of the treatments.
Transforming the data does not ‘normalize’ the residuals either. Therefore, I am interested in applying any of the distributions (such as Gamma) that the Generalized Regression personality in JMP offers.
The variance is proportional to the response level. You might try a Poisson log-linear regression through the Generalized Linear Models personality since you are counting. Select the Poisson distribution and the canonical log link function. Be sure to include both options: over-dispersion and Firth bias adjustment. Because you are using a DOE, I expect that the opportunity for each run is the same and, therefore, no offset is required.
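As a rough illustration of what the over-dispersion option is estimating, here is a sketch in Python with made-up counts (not the attached data). It assumes a Poisson log-linear model with both main effects and their interaction, for which the maximum likelihood fitted means equal the cell means, and it uses the usual Pearson chi-square divided by its degrees of freedom as the dispersion estimate. It does not include the Firth bias adjustment, and I am not claiming it reproduces JMP's report.

```python
from statistics import mean, variance

# Hypothetical spore counts for a 2x2 factorial, 4 replicates per cell
# (made-up numbers for illustration only).
counts = {
    ("F1", "B1"): [10, 18, 7, 15],
    ("F1", "B2"): [22, 48, 19, 43],
    ("F2", "B1"): [3, 9, 4, 8],
    ("F2", "B2"): [12, 27, 14, 27],
}

n = sum(len(ys) for ys in counts.values())  # 16 runs
p = len(counts)  # 4 parameters: intercept, two main effects, interaction

pearson_chi2 = 0.0
for cell, ys in counts.items():
    mu = mean(ys)  # fitted mean under the assumed full factorial Poisson model
    # Poisson variance function is V(mu) = mu.
    pearson_chi2 += sum((y - mu) ** 2 / mu for y in ys)
    print(cell, "mean:", mu, "variance/mean:", round(variance(ys) / mu, 2))

# A value near 1 is consistent with Poisson variation; well above 1
# suggests over-dispersion.
dispersion = pearson_chi2 / (n - p)
print("estimated dispersion:", round(dispersion, 2))
```

If the variance-to-mean ratio is roughly constant across cells but larger than 1, that is the situation the over-dispersion option is meant to handle.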