Hi @MetaLizard62080,
It's difficult to help without more context and precision about the importance of this factor (practical effect size, statistical significance, or both?), the design you created, and the way you analyzed and refined your model.
You may already have some ideas if you compare what has changed between the initial successful DoE for productivity and this one: factors? Levels? Design? Model choice? Objectives? Model metrics for adjustment/refinement?
However, here are some reflections, questions, and possible explanations besides the one you proposed:
- Design definition:
- Depending on the differences between factor ranges and the influence of this specific factor on the response, you may have amplified its importance/effect size compared to the other factors: for a linear effect, a factor tested over a range three times wider shows an effect (difference in response between its extreme settings) three times larger for the same underlying slope. You can try to augment your design and reduce this factor's range (and/or expand the other factors' ranges), in order to reduce the variability of the response attributable to this factor (and/or increase the variability attributable to the other factors).
- Depending on your objective (factor screening, exploration of the experimental space, prediction, ...), you might choose a different design, with a different sample size, replicate runs, etc. Is your sample size large enough for the differences you want to detect on the other factors? Is the chosen design appropriate for your objective?
- Handling of noise: Are your experiments "noisy"? If so, did you use strategies to identify and handle noise coming from different sources: repeatability, reproducibility, noise factors, external influences, etc.?
- Was this new DoE created "from scratch", or did you use design augmentation to benefit from the information in your historical data?
- Model building, refinement and selection:
- You mention that after reduction the adjusted R² was very low. What is your objective for model building and selection? Which metrics do you use, and are they related to that objective? How did the full assumed model perform on those metrics before reduction? On which information/basis/metric did you reduce your model?
Depending on your objective (explainability vs. predictivity, or both) and the related metrics of interest (R²/adjusted R² for explainability, RMSE for predictivity, and information criteria for both, for example), you may end up with different models. You can try building several candidate models, in particular so as not to rely only on statistical significance and p-values (you can read the discussion about this topic Statistical Significance); the first sketch after this list compares two candidate models on several metrics. You can also identify and assess whether the effect size/practical significance of the factors/terms in the model is important, based on domain expertise.
- Keep in mind that the significance and effect size of a factor depend on which other terms are in the model. So depending on how you "reduce" your model and which terms are removed, the p-values and effect size estimates can change completely; the first sketch after this list also illustrates this. You can read more here: Significant factor become non-significant after removing non-significant ones
- Models coming from DoE also rely on the principles of effect sparsity, effect heredity, and effect hierarchy (for example, heredity suggests keeping a factor's main effect in the model if one of its interactions or its quadratic term is kept). You can try to verify whether the model you're building respects these principles. See Is it possible to have one factor which is not significant in a response surface model but the quadr...
- Did you check the residuals? Are there any patterns that may indicate a violation of the regression assumptions (Regression Model Assumptions | Introduction to Statistics | JMP)? Do you have a lack-of-fit test available? Does the response you are trying to predict cover very different orders of magnitude? Depending on your answers, there may be something to check and correct in the model building to make sure the regression model assumptions are met (or you may need a different regression model with added/removed terms, a transformation of the response, a generalized regression model, etc.); the second sketch after this list shows a quick residual check with and without a log transformation of the response. See Problem with RSM fit for a factor
- Are the results relevant for domain experts, and aligned with previous results?
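To make the points about model metrics and model reduction more concrete, here is a minimal sketch in Python/statsmodels (outside JMP), on purely simulated toy data; the names (x1, x2, y) and all numbers are hypothetical, not your experiment. It fits a "full" and a "reduced" model and prints adjusted R², RMSE, AIC (JMP reports the small-sample corrected AICc) and the estimate and p-value of x1, so you can see how they shift when a correlated term is dropped:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated, non-orthogonal toy data: x2 is correlated with x1,
# as can happen in historical data or an imbalanced/augmented design.
rng = np.random.default_rng(1)
n = 24
x1 = rng.uniform(-1, 1, n)
x2 = 0.8 * x1 + 0.2 * rng.normal(size=n)
y = 2.0 * x1 + 1.5 * x2 + rng.normal(scale=1.0, size=n)
df = pd.DataFrame({"y": y, "x1": x1, "x2": x2})

full = smf.ols("y ~ x1 + x2", data=df).fit()
reduced = smf.ols("y ~ x1", data=df).fit()  # "reduction": x2 removed

for name, m in [("full", full), ("reduced", reduced)]:
    rmse = np.sqrt(np.mean(m.resid ** 2))
    print(f"{name:8s}  adj.R2={m.rsquared_adj:.3f}  RMSE={rmse:.3f}  "
          f"AIC={m.aic:.1f}  coef(x1)={m.params['x1']:.2f}  p(x1)={m.pvalues['x1']:.4f}")
```

The two models rank differently depending on which metric you look at, and the estimate and p-value of x1 change when x2 leaves the model, because x1 then absorbs part of x2's effect.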
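And a second sketch, in the same spirit, for the residual check: it compares residuals-vs-fitted plots for a model of the raw response and a model of the log-transformed response, on simulated data whose response spans several orders of magnitude (again, all names and values are hypothetical):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
import matplotlib.pyplot as plt

# Simulated response with multiplicative noise, spanning several orders of magnitude.
rng = np.random.default_rng(2)
n = 24
x1 = rng.uniform(-1, 1, n)
x2 = rng.uniform(-1, 1, n)
y = np.exp(1.0 + 2.0 * x1 + 1.0 * x2) * rng.lognormal(sigma=0.3, size=n)
df = pd.DataFrame({"y": y, "x1": x1, "x2": x2})

raw = smf.ols("y ~ x1 + x2", data=df).fit()
logged = smf.ols("np.log(y) ~ x1 + x2", data=df).fit()

fig, axes = plt.subplots(1, 2, figsize=(8, 3))
for ax, (name, m) in zip(axes, [("raw y", raw), ("log(y)", logged)]):
    ax.scatter(m.fittedvalues, m.resid)
    ax.axhline(0, color="grey", lw=1)
    ax.set(title=f"Residuals vs fitted ({name})", xlabel="fitted", ylabel="residual")
plt.tight_layout()
plt.show()
```

A funnel or curved pattern on the raw scale that disappears after the transformation is a hint that the response should be modelled on the log scale (or with a generalized regression model using an appropriate distribution/link).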
In the end, your model is driven by the design you have chosen, so it's difficult to help with the model without understanding how and why you chose and created the design for your topic.
Any toy dataset showing what you're facing may be helpful for a more precise follow-up.
Hope this discussion starter helps you,
Victor GUILLER
L'Oréal Data & Analytics
"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)