Hello!
I am using JMP 13 in my master's thesis to build a statistical model. I am new to building models, so I apologize in advance if my questions seem a little basic, but any kind of help would really mean a lot to me.
In my thesis, I ran a Definitive Screening Design (6 factors + 2 fake factors, 7 responses, 17 runs) to study a process. After completing all the necessary experiments and analyses, I analyzed the gathered data with the two analysis methods proposed by Jones and Nachtsheim in 2016: Effective Model Selection for DSDs (EMS) and forward stepwise regression (FSR). They say that EMS is generally preferred, but that FSR performs adequately if the number of active effects is no more than half the number of runs and there are at most two active two-way interactions or at most one active quadratic effect (DOE Guide p. 257). These conditions hold for all of our responses.
I have some questions about selecting effects for model building with the FSR method. I ran FSR with the following settings:
Stopping Rule: Minimum AICc
Direction: Forward
Rules: Combine
FSR selected some effects, and I then fit a model with Standard Least Squares.
So here is my first question:
In the Effect Summary (see the photo Effect Summary 1 below), I noticed that the factor Temp (5,25) has a PValue of 0,16260. I know that this factor is included because of the Combine rule (Temp. is part of the interaction Konc.lakt*Temp., which has a PValue under 0,05). Does the model need Temp. as a main effect, or can I build a model without it even though it appears in the interaction? If I remove Temp., JMP keeps warning me that Temp. is missing.
I also noticed that deleting Temp. as a factor doesn't change the model much, which makes sense given its high PValue, which I understand as a sign that the factor is insignificant (or am I wrong?). The PValues of the other included factors changed as well, so the factor Konc.lakt now also has a PValue larger than 0,05 (should I exclude it as well?).
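To make clear what I mean by Temp. "missing" from the model, here is a small Python sketch of the check as I understand it. It is not JMP code and not how JMP implements the warning; the function name `missing_parents` is my own, and I am assuming that terms are written with `*` between factor names, as in the Effect Summary.

```python
def missing_parents(terms):
    """Return factors that appear inside an interaction or power term
    (e.g. "A*B" or "A*A") but are not in the model as main effects."""
    included = set(terms)
    missing = []
    for term in terms:
        parts = term.split("*")
        if len(parts) < 2:
            continue  # a main effect has no parent terms to check
        for parent in parts:
            if parent not in included and parent not in missing:
                missing.append(parent)
    return missing


# Model with the main effect kept, and the same model with Temp. removed.
with_temp = ["Konc.lakt", "Temp.", "Konc.lakt*Temp."]
without_temp = ["Konc.lakt", "Konc.lakt*Temp."]

print(missing_parents(with_temp))     # []
print(missing_parents(without_temp))  # ['Temp.']
```

The second model is the one I am asking about: the interaction is still there, but one of the factors it is built from is not.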
I compared two models: the one with all the effects shown in Effect Summary 1, and the one with only the effects that have a PValue lower than 0,05 (see photo Effect Summary 2). See the photos below: Actual by predicted plot 1 for the reduced model and Actual by predicted plot 2 for the unreduced model. The models seem very similar. The model with more effects fits a little better (which makes sense), but not by much. The residual analysis is also very similar, and the points from additional experiments that were not used to build the model (blue, purple and green stars) are in similar positions. Is it right to keep the less active effects in the model? I don't think so, since these effects don't change the model much, so including them could be misleading and could overfit the model. Or am I perhaps wrong?
If I look at another response, where only one factor has a PValue lower than 0,05, then adding another factor improves the model a bit more than in the previous example (better fit, better residual analysis), but still not by much (see photos below: Actual by predicted plot 3 for the unreduced model and Actual by predicted plot 4 for the reduced model). Do things change if we are left with only one factor in the reduced model?
Thank you in advance for all your answers and effort.
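For reference, this is how I understand the trade-off between fit and number of effects that AICc measures. It is only a sketch in Python, assuming a least squares model with normal errors and counting the error variance as one extra estimated parameter; the helper name `aicc` and the SSE values are made up for illustration and are not my results.

```python
import math


def aicc(sse, n, n_coef):
    """AICc for a least squares fit with normally distributed errors.

    sse    -- residual sum of squares
    n      -- number of runs
    n_coef -- number of regression coefficients, including the intercept
    """
    k = n_coef + 1  # coefficients plus the error variance
    if n - k - 1 <= 0:
        raise ValueError("too few runs for this many parameters")
    # -2 * log-likelihood, using the maximum likelihood variance sse / n
    neg2loglik = n * (math.log(2 * math.pi * sse / n) + 1)
    return neg2loglik + 2 * k + 2 * k * (k + 1) / (n - k - 1)


# Hypothetical numbers for a 17-run design, NOT taken from my data:
# the larger model has a slightly smaller SSE but two more coefficients.
larger = aicc(sse=10.0, n=17, n_coef=6)
smaller = aicc(sse=11.0, n=17, n_coef=4)
print(larger, smaller)
print("smaller model preferred:", smaller < larger)
```

With so few runs, the penalty term grows quickly with each added effect, so a small improvement in fit may not be enough to justify keeping an effect by this criterion.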
Danijel