Re: Fit Definitive Screening: does Main Effects Plot disagree with model Parameter Estimates

mjoner

Community Trekker

Joined:

Jun 23, 2011

I'm thinking there must be a simple explanation for this but I'm having "statistician's block" today.

In the attached data, I am curious about at least two things.

1. Fit Definitive Screening identifies X__3 as significant but not X__2. When I fit the same model with Fit Model, X__2's parameter estimate is much smaller than X__3's, so its t-ratio is smaller and X__2 is not significant. That is fine, but it seems to contradict the Main Effects Plot shown here:

ME Plot.png

2. Based on the Prediction Profiler, the parameter estimate on X__7 has a different sign than what we see in the main effects plot.

Hypotheses I have considered: (a) Somehow interactions are affecting the main effect estimates (which is why I created a second y variable, X__9), even though I know this shouldn't happen, since main effects are clear of interactions in definitive screening designs. Given that the same phenomena appear in both X__8 and X__9, I have ruled this out. (b) The presence of categorical factors in the DSD is confounding the categorical factors with the continuous factors. This is possible, but design diagnostics on this DSD indicate only very minor correlation between the continuous and categorical factors, about r=0.1. Also, it's hard for me to see how this would cause X__3 to be selected before X__2, since there is no correlation between these two columns of the design matrix.

Data are attached. Note I have two y variables. The real y is X__8. I created X__9 in a way that removes significant interactions.

1 ACCEPTED SOLUTION

Accepted Solutions
bradleyjones

Staff

Joined:

Mar 30, 2012

Solution

There are a number of issues here.

 

In the main effects plot, the line of fit that is shown is the result of the simple linear regression of y on each individual x. If there were no categorical factors, the slope for each factor would be the same whether you fit a simple linear regression or a multiple regression, because all the main effects would be orthogonal. You have two categorical factors, so there are small correlations between these factors and all the others.

 

X3 is correlated with the categorical factors X5 and X7, so the main effects plot, which shows the fit of Y on X3 with nothing else in the model, has a coefficient of 0.08 due to aliasing from X5 and X7. Its coefficient in the main effects model, by contrast, is 0.22.

 

X2 is also correlated with X5 and X7. Its coefficient in the main effects plot is 0.157, whereas its coefficient in the main effects model is 0.014. In this case the simple slope of Y versus X2 looks steeper, again due to aliasing by X5 and X7.
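A small sketch may make the gap between the two slopes concrete. The data below are invented for illustration and are not the attached table: x1 stands in for a continuous factor, x2 for a two-level categorical factor coded -1/+1, and the two columns have a correlation of about -0.22. The response is built with no noise from assumed true coefficients 0.25 (x1) and 2.0 (x2). The function names are hypothetical, and plain Python is used rather than JMP.

```python
def mean(values):
    return sum(values) / len(values)

def centered_dot(a, b):
    """Sum of products of deviations from the means of a and b."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))

def simple_slope(x, y):
    """Slope from regressing y on x alone (what the plot line would show)."""
    return centered_dot(x, y) / centered_dot(x, x)

def two_predictor_coefs(x1, x2, y):
    """Least-squares coefficients of x1 and x2 fitted together, with an intercept."""
    s11, s22, s12 = centered_dot(x1, x1), centered_dot(x2, x2), centered_dot(x1, x2)
    s1y, s2y = centered_dot(x1, y), centered_dot(x2, y)
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

# Invented design: x1 is three-level, x2 is two-level and mildly correlated with x1.
x1 = [-1, -1, -1, -1, 0, 0, 1, 1, 1, 1]
x2 = [1, 1, 1, -1, -1, -1, 1, 1, -1, -1]
# Assumed true model (no noise): y = 0.25*x1 + 2.0*x2
y = [0.25 * a + 2.0 * b for a, b in zip(x1, x2)]

slope_alone = simple_slope(x1, y)                # -0.25
coef_x1, coef_x2 = two_predictor_coefs(x1, x2, y)  # 0.25, 2.0

print("slope of y on x1 alone:", slope_alone)
print("coefficients with x2 in the model:", coef_x1, coef_x2)
```

In this toy case the slope of y on x1 alone even has the wrong sign, because the large x2 effect leaks into it through the correlation, while the joint fit recovers the coefficients used to build y.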

 

I should point out that your data table is not a DSD. There are correlations between main effects and two-factor interactions. This is because a few of the entries in X5 and X7 got switched between foldover pairs.
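One way to look for this kind of problem is to test whether every run has a sign-reversed partner in the table. The sketch below is only an illustration under stated assumptions: all factors, including the two-level categorical ones, are coded numerically as -1/0/+1, the function name `unmatched_foldover_runs` is hypothetical, and the tiny designs are made up rather than taken from the attached data.

```python
from collections import Counter

def unmatched_foldover_runs(design):
    """Return the distinct runs whose sign-reversed mirror is missing
    (or appears a different number of times) in the design."""
    counts = Counter(tuple(run) for run in design)
    unmatched = []
    for run, count in counts.items():
        mirror = tuple(-value for value in run)
        # Counter returns 0 for a missing key, so an absent mirror is flagged.
        if counts[mirror] != count:
            unmatched.append(run)
    return unmatched

# Made-up two-column designs: (continuous factor, categorical factor coded -1/+1).
intact = [(1, 1), (-1, -1), (0, 1), (0, -1)]
# Same design with one categorical entry altered, breaking a foldover pair.
broken = [(1, 1), (-1, 1), (0, 1), (0, -1)]

print(unmatched_foldover_runs(intact))  # []
print(unmatched_foldover_runs(broken))  # [(1, 1), (-1, 1)]
```

An empty result only says that the foldover pairing is intact under this coding. It does not by itself confirm that a table is a valid DSD.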

3 REPLIES
mjoner

Community Trekker

Joined:

Jun 23, 2011

So essentially we should ignore the main effects plot if we are running a "DSD" with categorical factors? The "Fit Definitive Screening" reduced model is giving the best estimates we can get, short of collecting more data to break the aliasing?

bradleyjones

Staff

Joined:

Mar 30, 2012

If the categorical factors are not active, then the main effects plot will be useful. However, if the categorical factors have large effects, then you should believe the profiler rather than the main effects plot.

 

It is too late to change this undesirable behavior for JMP 14, but I hope to plot the line using the coefficient from the multiple regression of the main effects in JMP 14.1.