Hi @MJZ82,
The final p-values of the main effects increase because they are calculated with a different number of degrees of freedom than in the first stage alone. In the first stage, only main effects are studied, so more degrees of freedom are available in the design, hence the use of a 0.05 threshold to detect main effects.
In the second stage, some degrees of freedom have already been used to include the main effects in the model, so higher-order effects are entered with a higher p-value threshold (often 0.2) because fewer degrees of freedom are left to test them the way main effects were tested in stage 1. You can read more about the methodology here: Statistical Details for the Fit Definitive Screening Platform
In the final combined model panel, the p-values are calculated as if the terms of this model had been entered simultaneously in a standard least squares model, hence the different (and higher) p-values compared with those calculated in each of the two stages. The profiler displayed at the end corresponds to this final combined model.
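To illustrate the difference between the stage-wise tests and the combined fit, here is a minimal Python sketch with made-up data (it is not JMP's exact algorithm; the factor names, effect sizes, and run count are invented). The second-order terms are first tested against the stage-1 residuals, then all retained terms are refitted together, which leaves fewer residual degrees of freedom per term and generally raises the p-values:

```python
# Minimal sketch (not JMP's exact algorithm) of why combined-model p-values
# differ from the stage-wise ones. Data and column names are made up.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 17
X = pd.DataFrame(rng.choice([-1.0, 0.0, 1.0], size=(n, 3)),
                 columns=["X1", "X2", "X3"])
y = 3 * X["X1"] - 2 * X["X2"] + 1.5 * X["X1"] * X["X2"] + rng.normal(0, 1, n)

# Stage 1: main effects only (more residual df available to test them)
stage1 = sm.OLS(y, sm.add_constant(X)).fit()
print(stage1.pvalues)

# Stage 2: second-order terms fitted to the stage-1 residuals
X2nd = pd.DataFrame({"X1*X2": X["X1"] * X["X2"], "X1^2": X["X1"] ** 2})
stage2 = sm.OLS(stage1.resid, sm.add_constant(X2nd)).fit()
print(stage2.pvalues)

# Combined model: all retained terms entered simultaneously in one least
# squares fit -- fewer residual df per term, so p-values are generally larger
Xall = pd.concat([X, X2nd], axis=1)
combined = sm.OLS(y, sm.add_constant(Xall)).fit()
print(combined.pvalues)
```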
I would not reduce the model further based on p-values from the Combined Model Parameter Estimates panel. The methodology used by Fit Definitive Screening is sequential: it first identifies the main effects that influence the response, then identifies active second-order effects based on the residuals of the main-effects model fitted previously. The p-values calculated with a simultaneous inclusion of the terms (as in the Combined Model Parameter Estimates panel) do not reflect how the terms were identified and entered, so they may be biased, and the decision to remove terms should not be based solely on this information. It is also important to respect effect heredity when fitting models from DoEs: do not remove a main effect if a significant interaction or quadratic effect involving the same factor is in the model.
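As a simple illustration of the heredity rule, here is a hypothetical Python helper (not a JMP feature) that lists the main effects a model must keep given its interaction and quadratic terms; the term-naming convention ("X1*X2", "X3^2") is only an assumption for the example:

```python
# Hypothetical helper (not a JMP feature): which main effects must stay in the
# model to respect effect heredity, given the retained higher-order terms.
def mains_required_by_heredity(model_terms):
    """model_terms: e.g. ["X1", "X3", "X1*X2", "X3^2"]"""
    required = set()
    for term in model_terms:
        if "*" in term:                    # two-factor interaction
            required.update(term.split("*"))
        elif "^" in term:                  # quadratic term
            required.add(term.split("^")[0])
    return required

print(mains_required_by_heredity(["X1", "X3", "X1*X2", "X3^2"]))
# contains 'X1', 'X2', 'X3': X2 must be added back (or the interaction dropped)
```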
You can still try different models using the various modeling platforms available (the Fit Definitive Screening platform, the Fit Two Level Screening platform, Standard Least Squares models, Generalized Regression models, ...) and compare the terms those methods have in common and the ones that differ. The topic of modeling is much broader (and sometimes more complicated) than "only" relying on p-values. Depending on your objective(s), you may follow different paths for model evaluation and selection:
- Explanatory model: In an explanatory mode, you focus on the terms that have some influence on the response(s), so you might evaluate the need to include each term based on statistical significance (with the help of p-values and a predefined threshold such as 0.05) and practical significance (size of the estimates, selection based on domain expertise). R² and adjusted R² (and the difference between the two, which should be small) can be good metrics to understand how much variation is explained by the identified terms and to select relevant model(s) to explain the system under study.
- Predictive model: In a predictive mode, you focus on the terms that help you minimize prediction errors, so you might evaluate the need to include each term based on how it improves predictive performance, through visualizations such as the actual vs. predicted plot and the size of the errors (residual plot). RMSE can be a good metric to assess which model(s) have the best predictive performance (the goal is to minimize RMSE). A short sketch illustrating R², adjusted R², and RMSE follows this list.
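To make these metrics concrete, here is a minimal numpy sketch that computes R², adjusted R², and RMSE by hand; y_obs, y_pred, and the parameter count p are placeholders you would take from your own fits:

```python
# Minimal sketch of the fit metrics mentioned above, computed by hand.
import numpy as np

def fit_metrics(y_obs, y_pred, p):
    """p = number of estimated parameters, including the intercept."""
    y_obs, y_pred = np.asarray(y_obs, float), np.asarray(y_pred, float)
    n = y_obs.size
    sse = np.sum((y_obs - y_pred) ** 2)          # residual sum of squares
    sst = np.sum((y_obs - y_obs.mean()) ** 2)    # total sum of squares
    r2 = 1 - sse / sst
    r2_adj = 1 - (1 - r2) * (n - 1) / (n - p)    # penalizes extra terms
    rmse = np.sqrt(sse / (n - p))                # root mean square error
    return {"R2": r2, "R2_adj": r2_adj, "RMSE": rmse}

print(fit_metrics([10.1, 11.8, 14.2, 15.9], [10.0, 12.0, 14.0, 16.0], p=2))
```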
You might also be interested in a combination of the two approaches, in which case other metrics can help you evaluate and select models, such as information criteria (AICc, BIC) that find a compromise between a model's predictive performance and its complexity. When evaluating and selecting a model based on these criteria, the lower the better.
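For completeness, here is a minimal sketch of AICc and BIC for a least squares fit, using one common convention for counting parameters (JMP's exact constants may differ, but the ranking of models and the "lower is better" reading are the same):

```python
# Minimal sketch of AICc and BIC for a least squares fit. k counts all
# estimated parameters (model terms + intercept + error variance); this is one
# common convention and may not match JMP's reported values exactly.
import numpy as np

def info_criteria(y_obs, y_pred, n_terms):
    y_obs, y_pred = np.asarray(y_obs, float), np.asarray(y_pred, float)
    n = y_obs.size
    sse = np.sum((y_obs - y_pred) ** 2)
    k = n_terms + 2                               # terms + intercept + variance
    loglik = -0.5 * n * (np.log(2 * np.pi * sse / n) + 1)
    aic = -2 * loglik + 2 * k
    aicc = aic + 2 * k * (k + 1) / (n - k - 1)    # small-sample correction
    bic = -2 * loglik + k * np.log(n)
    return {"AICc": aicc, "BIC": bic}

print(info_criteria([10.1, 11.8, 14.2, 15.9, 17.7],
                    [10.0, 12.0, 14.0, 16.0, 18.0], n_terms=1))
```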
I hope this answer helps you,
Victor GUILLER
"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)