Discussions

CYLiaw · May 4, 2020 02:20 PM

Hi!

I have a full factorial design with four factors. Three of the factors have three levels and the other one has five levels, so I have 3x3x3x5 = 135 total runs with no replicates. I have a few questions:

1. After I fit the model, the report shows a significant lack of fit. However, there are no replicated data points. How can JMP perform the lack-of-fit test then?

2. Since it shows lack of fit, my understanding is that more terms should be included in the model. Can I add the quadratic terms to the model? Or is a full factorial design only used for studying the main effects and the interaction effects?

3. I tried adding the quadratic terms to the model, but the lack-of-fit test was grayed out after I ran the model. Why is that? I don't think my model is saturated, since there are more data points than predictors.

4. Can a full factorial design with three levels (-1, 0, +1) be considered a response surface design because it also has factors at the mid-level? Why can't a full factorial design with three-level factors capture the curvature of the response surface?

Thank you!

CYLiaw · May 4, 2020 06:02 PM

Hi @statman,

Thanks for your reply. Yes, all factors are continuous with discrete levels. I still don't quite understand your reply to question 3. What do you mean by 'non-linear interaction terms are not considered to be included in the model' and 'non-linear interaction terms are pooled in the MSE'? How and why are the non-linear interaction terms pooled in the MSE?

Best,

CYLiaw · May 4, 2020 06:49 PM

I just came up with another question. Sorry to keep bothering you all. As @Mark_Bailey said, I found that after removing one of the factors, I can run the lack-of-fit test. However, the model still didn't pass the lack-of-fit test even after I added the quadratic terms. The analysis actually makes sense to me, so I don't know why it still shows lack of fit. In this case, is there another way to validate the model? Or is the model completely useless? Or can I divide my full factorial data set (3x3x3x5 = 135 runs) into training, validation, and test sets to show whether the model is useful or not?

Mark_Bailey · May 5, 2020 05:47 AM

Just to be clear, when you say that "the model still didn't pass the lack of fit test even adding the quadratic terms," you mean that the test is significant.

The lack-of-fit hypothesis test is not perfect. No hypothesis test is. It is one way that you can use the data to help you decide if the current model is biased. The idea is that the current error sum of squares is either the random deviations in the response alone (unbiased) or a combination of fixed and random deviations (biased). (You do not seem to have any factors with random effects in this case.) The null hypothesis is that the model is unbiased. The test splits the error sum of squares into a pure error part and a lack-of-fit part, then applies the same analysis of variance to obtain another F ratio. The test can mislead in either direction if the estimate of the pure error is too small or too large, or if the assumptions of the test are not met. It might also be the case that the degrees of freedom in the F ratio are too few to make the test reliable.
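To make the partition behind the test concrete, here is a minimal sketch of the textbook lack-of-fit calculation for a straight-line fit with replicated points. The data values are made up for illustration, the function name `lack_of_fit` is hypothetical, and JMP's own computation may differ in details such as how it groups rows.

```python
from collections import defaultdict

def lack_of_fit(x, y):
    """Textbook lack-of-fit F ratio for a straight-line fit y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Total error sum of squares around the fitted line.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

    # Pure error: variation of replicates around their own group mean.
    groups = defaultdict(list)
    for xi, yi in zip(x, y):
        groups[xi].append(yi)
    ss_pe = 0.0
    for ys in groups.values():
        mean = sum(ys) / len(ys)
        ss_pe += sum((yi - mean) ** 2 for yi in ys)

    n_params = 2                      # intercept and slope
    df_pe = n - len(groups)           # replicates beyond one per setting
    df_lof = len(groups) - n_params   # distinct settings beyond model terms
    ss_lof = sse - ss_pe              # what is left is lack of fit
    f_ratio = (ss_lof / df_lof) / (ss_pe / df_pe)
    return {"sse": sse, "ss_pe": ss_pe, "ss_lof": ss_lof,
            "df_pe": df_pe, "df_lof": df_lof, "f_ratio": f_ratio}

# Made-up data: three settings, two replicates each, with clear curvature.
x = [-1, -1, 0, 0, 1, 1]
y = [1.0, 1.2, 3.1, 2.9, 1.1, 0.9]
result = lack_of_fit(x, y)
print(result)
```

A large F ratio here says the misfit of the line is large relative to the replicate-to-replicate noise. With no replicates at all, `df_pe` would be zero and the ratio could not be formed, which is why the test needs some source of pure error.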

You can use honest assessment to select the best model among all candidate models. Cross-validation with held-out training, validation, and test sets is often unsuccessful for this assessment, though, with a small data set such as yours. It is intended for BIG DATA. You might instead use K-fold cross-validation, the adaptation for small data sets. This method is about model selection. It is not model validation. This case uses an empirical model based on fitting empirical evidence with an interpolating function. You must validate your choice by selecting the model, predicting new observations (under different conditions than those that were observed in the original experiment), and confirming the predictions with new empirical evidence to increase your belief in the selected model. We must work equally hard in science to find evidence that both supports and refutes our theory or model, and adjust accordingly, if we want our model to be realistic.
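As a rough illustration of how K-fold cross-validation would partition the 135 runs, here is a minimal sketch using only the Python standard library. The function name `k_fold_splits`, the choice of 5 folds, and the seed are assumptions for illustration; this is not how JMP implements it.

```python
import random

def k_fold_splits(n_runs, k, seed=0):
    """Return k (train, test) index pairs; each run is held out exactly once."""
    indices = list(range(n_runs))
    random.Random(seed).shuffle(indices)       # randomize run order once
    folds = [indices[i::k] for i in range(k)]  # deal runs into k folds
    splits = []
    for i, test in enumerate(folds):
        # Train on every fold except the one currently held out.
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        splits.append((train, test))
    return splits

# 3x3x3x5 = 135 runs, 5 folds assumed for illustration.
splits = k_fold_splits(135, 5)
for train, test in splits:
    print(len(train), len(test))
```

Each candidate model would be fit on the training runs of a split and scored on the held-out runs, with the scores averaged over the splits. As noted above, this compares candidate models; it does not replace confirmation runs at new conditions.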

statman · May 4, 2020 06:51 PM

Sorry for any confusion I have caused. Let me keep it simple. Two factors at three levels in an unreplicated factorial gives 9 treatment combinations. That means you have 8 degrees of freedom. The "theoretical" model is:

Y=A+B+AB+AA+BB+AAB+ABB+AABB

What is AAB (or ABB)? This is a non-linear (quadratic in this case) interaction: the curvature for A depends on the level of B. This is certainly something that can happen in engineering and science. Now, the model you will see when you use the JMP Response Surface macro (to construct model effects) is:

Y=A&RS+B&RS+AB+AA+BB

When you analyze this model, you will get an MSE term from the three degrees of freedom left out of the model (AAB, ABB, and AABB). This is the basis for the F-test.
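The degrees-of-freedom accounting above can be checked with a small sketch. It builds the 3x3 design with coded levels -1, 0, +1 (an assumption for illustration), forms the model matrix columns as powers of A and B, and computes the rank exactly. The helper name `rank` and the exponent-pair encoding of terms are hypothetical choices for this example.

```python
from fractions import Fraction
from itertools import product

def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

# 9 treatment combinations of two 3-level factors (coded levels assumed).
runs = list(product([-1, 0, 1], repeat=2))

def model_matrix(terms):
    # A term (i, j) stands for the column A**i * B**j; (0, 0) is the intercept.
    return [[a ** i * b ** j for i, j in terms] for a, b in runs]

# Intercept plus A, B, AB, AA, BB, AAB, ABB, AABB.
full_terms = [(i, j) for i in range(3) for j in range(3)]
# Intercept plus the response surface terms A, B, AB, AA, BB.
rs_terms = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 0), (0, 2)]

full_rank = rank(model_matrix(full_terms))
rs_rank = rank(model_matrix(rs_terms))
residual_df = len(runs) - rs_rank
left_out = [t for t in full_terms if t not in rs_terms]
print(full_rank, rs_rank, residual_df, left_out)
```

The full set of terms uses all 9 runs, so nothing is left for error. The response surface model uses 6 (intercept plus 5 terms), and the 3 remaining degrees of freedom correspond to AAB, ABB, and AABB.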

"All models are wrong, some are useful" G.E.P. Box


Full Factorial Design questions
