Solved: Analysis of a Mixture DOE with stepwise regression

Jul 10, 2023 04:28 AM

In the mixture case studies (webcast and JMP documentation library) I notice that backward regression is used to analyse the results. Why is stepwise regression with forward selection or "All Possible Models" not used in regular (non-Pro) JMP? Is stepwise not possible because of the special nature of mixture models?

Dan_Obermiller · Jul 13, 2023 11:18 AM

There is a script in the Sample Scripts folder (in the JMP folder under \Samples\Scripts) that does this. The name of the script is "LabelMixturePoints.jsl".

Dan Obermiller


statman · Jul 10, 2023 09:49 AM

Here are my thoughts, mostly philosophical in nature:

My first thought is to NOT do stepwise for experimental design, ever. The experiment is designed to investigate factors and a predicted model for those factors (e.g., first order, first order plus second order, etc.). The model should be predicted a priori. Stepwise is a procedure to uncover relationships that may not be known or predicted.

Mixture designs, for the most part, are optimization designs. That is, the components of the mixture are already known to be important and now you are trying to find the best "area" of the surface to operate in. You should already know the terms of the model and are optimizing those terms. There may also be a fair amount of collinearity in mixture designs which makes stepwise challenging.

"All models are wrong, some are useful" G.E.P. Box

frankderuyck · Jul 10, 2023 09:57 AM

How then do I find the best model: backward regression? "All Possible Models" in Stepwise?

statman · Jul 10, 2023 01:47 PM

The approach depends greatly on where you are in the knowledge continuum (e.g., screening or optimizing, explaining or predicting). When experimenting in the screening phase, I do indeed recommend starting with the full model (saturated if possible) and removing insignificant terms from there (backward or subtractive model building). When you are optimizing, as is generally the case with mixture designs, the questions are a bit different. You should already have a good idea of the 1st and 2nd order linear and non-linear terms. Statistical significance is less important (it has already been established). You are "fine tuning" the level settings.

TaoTaoTao · Jan 10, 2024 05:49 AM

Thanks for your detailed answers, which inspired me to think about formulation work.

How would you suggest dealing with an unknown term in the model? For example, I want to discover the effects, especially the 2-way interactions, due to a new ingredient in the formulation. The aim of the study is more like screening than optimization; would stepwise be a good fit in this case?

statman · Jan 10, 2024 10:12 AM

I'm confused by your question. You say "unknown term of the model" and then you say the term is a known new ingredient. I would experiment on the new ingredient with a mixture design and use analysis of mixture designs to assess the effect of the new ingredient and possible interactions. I would not use stepwise in this case. Sorry to be redundant, IMHO, stepwise is used to expose factors that have not been identified (e.g., data mining), not for analysis of experiments where the factors in the study are known.

Victor_G · Jul 10, 2023 03:56 PM

Hi @frankderuyck,

I agree with the "philosophical" point of view from @statman concerning stepwise regression: it is better suited to "non-designed" datasets, where the goal is to uncover factors and effects in the absence of an a priori model. Traditional model selection techniques using p-values are not useful for mixtures: because of multicollinearity between the estimated effects (aliasing between effects), the standard errors of the estimates are quite large, resulting in misleading and distorted p-values.
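To illustrate where that collinearity comes from, here is a minimal sketch in Python (not JMP). The design and the helper name `pearson` are assumptions made for illustration only. On a three-component simplex lattice the proportions sum to 1 in every run, so the component columns are negatively correlated with each other and together reproduce an intercept column exactly.

```python
import math

# Hypothetical {3,3} simplex-lattice design: proportions in steps of 1/3.
design = [(i / 3, j / 3, (3 - i - j) / 3) for i in range(4) for j in range(4 - i)]

def pearson(a, b):
    """Pearson correlation between two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

x1, x2, x3 = zip(*design)

# Every run sums to 1, so x1 + x2 + x3 equals the intercept column.
row_sums = [x1[k] + x2[k] + x3[k] for k in range(len(design))]
r12 = pearson(x1, x2)

print("row sums:", row_sums)
print("corr(x1, x2):", r12)
```

The exact linear dependence on the intercept is the reason Scheffé mixture models are fitted without one.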

For (model-based) mixture designs, forward stepwise regression may be possible, but with safeguards and caution (for example, force the main effects into the model and use no intercept). Two options may be interesting to consider for mixture designs; both are highlighted by Dr. Philip J. Ramsey in one of his presentations, "Analysis Strategies for Constrained Mixture and Mixture Process Experiments Using JMP Pro 14":

Traditional Forward selection using the pseudo factor method of Miller
Traditional Forward selection using fractionally weighted bootstrapping and auto-validation: SVEM in JMP Pro, Generalized Regression platform (Gotwalt and Ramsey)

Dr. Ramsey does not recommend the use of "All Possible Models" for mixtures, in part because of the need to force the pure component terms into every model. From a pragmatic and practical perspective, the "All Possible Models" method can be computationally demanding (and not very effective), as all possible models with various numbers of effects are constructed, regardless of the hierarchy and heredity between effects. So a fairly large portion of the models created may not be very interesting (or relevant) to consider, but they are still created and evaluated in an agnostic, "brute-force" way.
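To give a sense of scale, the following Python sketch (outside JMP) counts candidate models for a Scheffé-type model with terms up to third order. The function name `count_candidate_models` and the hierarchy rule used here (a three-component term requires all of its two-component parents) are assumptions for illustration, not a description of what JMP does internally.

```python
from itertools import combinations

def count_candidate_models(q, max_order=3):
    """Count candidate Scheffe-type models for q components (terms up to max_order)."""
    terms = [c for k in range(1, max_order + 1) for c in combinations(range(q), k)]
    higher = [t for t in terms if len(t) > 1]

    unrestricted = 2 ** len(terms) - 1   # any non-empty subset of terms
    linear_forced = 2 ** len(higher)     # pure-component terms always kept

    # Linear terms forced AND each higher-order term needs all its parent terms.
    hierarchical = 0
    for k in range(len(higher) + 1):
        for subset in combinations(higher, k):
            chosen = set(subset)
            if all(set(combinations(t, len(t) - 1)) <= chosen
                   for t in chosen if len(t) > 2):
                hierarchical += 1
    return unrestricted, linear_forced, hierarchical

# Special cubic model with three components.
print(count_candidate_models(3))  # (127, 16, 9)
```

Under these assumptions, only 9 of the 127 unrestricted subsets keep the pure component terms and respect hierarchy, which is the "large portion of uninteresting models" point made above.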

Since there is an a priori model assumed for model-based mixture designs, backward regression is safer to use, with a validation method based on an information criterion like AICc (Generalized Regression platform in JMP Pro, or Backward Stepwise in JMP).

Since you're more interested in predictive performance than in factor screening, you can also do the backward regression "manually": start from the full model with all the supposed effects (JMP, Standard Least Squares), and remove terms as long as doing so decreases the RMSE (prediction error) of the model.
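The manual procedure can be sketched in Python (outside JMP) as follows. Everything here is an assumption for illustration: the data are synthetic, the function names are hypothetical, RMSE is taken as sqrt(SSE / (n - p)), and a term is only eligible for removal if no remaining higher-order term contains it, so the pure component terms always stay in.

```python
import math
import random

def fit_rmse(points, y, terms):
    """No-intercept least-squares fit; returns RMSE = sqrt(SSE / (n - p))."""
    X = [[math.prod(pt[i] for i in t) for t in terms] for pt in points]
    n, p = len(X), len(terms)
    # Augmented normal equations [X'X | X'y], solved by Gauss-Jordan elimination.
    A = [[sum(r[i] * r[j] for r in X) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    sse = sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2 for r, yi in zip(X, y))
    return math.sqrt(sse / (n - p))

def removable(term, model):
    """Linear terms stay; a term contained in a higher-order term stays too."""
    return len(term) > 1 and not any(set(term) < set(other) for other in model)

def backward_by_rmse(points, y, terms):
    """Drop one removable term at a time while the RMSE keeps decreasing."""
    model = list(terms)
    best = fit_rmse(points, y, model)
    while True:
        trials = [(fit_rmse(points, y, [t for t in model if t != d]), d)
                  for d in model if removable(d, model)]
        if not trials or min(trials)[0] >= best:
            return model, best
        best, drop = min(trials)
        model.remove(drop)

# Synthetic example: {3,3} simplex lattice, special cubic starting model.
random.seed(1)
points = [(i / 3, j / 3, (3 - i - j) / 3) for i in range(4) for j in range(4 - i)]
terms = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2), (0, 1, 2)]
y = [5 * a + 3 * b + 4 * c + 8 * a * b + random.gauss(0, 0.2) for a, b, c in points]

full_rmse = fit_rmse(points, y, terms)
model, final_rmse = backward_by_rmse(points, y, terms)
print("kept terms:", model)
print("RMSE full vs reduced:", full_rmse, final_rmse)
```

Because the RMSE here is computed on the same runs used for fitting, this is only a sketch of the mechanics; it does not replace validation runs.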

For model-agnostic mixture designs (like Space-Filling designs), Machine Learning methods, which are efficient and effective at interpolation, may be very useful for building a predictive model on the homogeneously distributed points (but overfitting may happen quickly): SVM, Neural Nets, k-Nearest Neighbors, Gaussian Process, ...

In the end, validation runs may be necessary to validate the model and estimate its predictive performance.

I hope this additional response will help you,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)

frankderuyck · Jul 10, 2023 04:44 PM

I saw in the webinars that the LogWorth (-log10 of the p-value) Pareto plot is used for the backward regression; if p-values are not suitable for mixtures, is this correct? Note: I need to give a training to non-JMP Pro users, so I can't use Generalized Regression, machine learning, etc.
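For reference, LogWorth is just a transformation of the p-value, so ranking terms by LogWorth gives the same order as ranking them by p-value, and any concern about p-values carries over. A minimal Python sketch, where the term names and p-values are hypothetical:

```python
import math

def logworth(p_value):
    """LogWorth: -log10(p-value); a LogWorth of 2 corresponds to p = 0.01."""
    return -math.log10(p_value)

# Hypothetical p-values for three model terms.
p_values = {"FLA1*FLA2": 0.0004, "FLA1*FLA3": 0.03, "FLA2*FLA3": 0.41}

# Largest LogWorth first, which is the same as smallest p-value first.
ranking = sorted(p_values, key=lambda term: logworth(p_values[term]), reverse=True)
print(ranking)
```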

frankderuyck · Jul 11, 2023 03:16 AM

I am analysing a three-factor (FLA 1, FLA 2 and FLA 3) mixture DOE based on a full cubic Scheffé model. Using the Standard Least Squares button and trying to remove low-effect parameters by backward regression based on the LogWorth Pareto plot (p-values for mixture analysis?), I get the result below: should I remove all the selected effects or only the non-contained effects?

I also did the analysis with "All Possible Models" in Stepwise and got a very nice model based on the AICc ranking. It may be a wrong model (will p-value-based backward regression generate a right one?), but it is very useful for specifying the settings that optimize the output Y!

I definitely agree that, because of the strong collinearity in mixture analysis, a validation column is necessary and that using Generalized Regression and actual machine learning tools is the right approach; I will have a close look at SVEM. For my DOE training for non-Pro users, however, I have to rely on classical methods; "All Possible Models" does not look too bad, as shown above? Of course, for more complex cases, with more than 3 mixture components and process effects yielding too many models, a sequential approach is definitely necessary; however, due to the strong collinearity, making a good selection of parameters to carry into the next sequential step will be quite difficult.

Victor_G · Jul 11, 2023 4:20 AM

Hi @frankderuyck,

Whether your DoE is a factorial or a mixture design, if there is an assumed a priori model, you should respect the three principles behind Design of Experiments:

Effect sparsity: The effect sparsity principle refers to the idea that only a few effects in an experiment will be statistically significant. Most of the variation in the response is explained by a small number of effects, so main effects (single factor) and two-factor interactions are most likely to be the significant effects in a factorial experiment. It is also called the Pareto principle.
Effect hierarchy: The hierarchy principle states that the higher the order of an interaction, the less likely it is to explain variation in the response. Therefore, main effects explain more variation than two-factor interactions (2FIs), 2FIs explain more variation than 3FIs, and so on, so priority should be given to the estimation of lower-order effects.
Effect heredity: Similar to genetic heredity, the effect heredity principle postulates that an interaction term may only be considered if the lower-order terms it is built from are significant. This principle has two possible implementations: strong or weak heredity. Strong heredity implies that an interaction term can be included in the model only if both of the corresponding main effects are present. Weak heredity implies that an interaction term can be included in the model if at least one of the corresponding main effects is present.
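The strong and weak heredity rules described in the list above can be checked mechanically. The following Python sketch uses a hypothetical function name and hypothetical factor labels; terms are written as tuples of factor names.

```python
def heredity_violations(model, strong=True):
    """Return interaction terms whose parent main effects are missing.

    strong=True  -> every factor in the interaction must be present as a main effect
    strong=False -> at least one factor must be present as a main effect (weak)
    """
    mains = {term[0] for term in model if len(term) == 1}
    violations = []
    for term in model:
        if len(term) < 2:
            continue  # main effects have no parents to check
        parents_present = [factor in mains for factor in term]
        ok = all(parents_present) if strong else any(parents_present)
        if not ok:
            violations.append(term)
    return violations

# Hypothetical model: main effect A, interactions A*B and C*D.
model = [("A",), ("A", "B"), ("C", "D")]
print(heredity_violations(model, strong=True))   # [('A', 'B'), ('C', 'D')]
print(heredity_violations(model, strong=False))  # [('C', 'D')]
```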

In your case, trying to remove these effects conflicts with the effect heredity principle: you can't remove the main effect FLA1 or the interaction FLA2xFLA3 unless you have already removed all the higher-order terms that contain them. So here you should keep them.

As stated before, creating models based solely on p-value for mixture designs may be very misleading.

The second option you tried with AICc (or BIC ?) criterion seems more reasonable, as AICc (or BIC) is an information criterion trying to "balance" the complexity of the model with its accuracy (the lower, the better).
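As a rough illustration of how these criteria trade fit against complexity, here is a Python sketch of AICc and BIC for a least-squares fit with normal errors. The function name and the example numbers are hypothetical, and counting the error variance as one extra parameter is an assumption that should be checked against the JMP documentation.

```python
import math

def information_criteria(sse, n, n_coefficients):
    """AICc and BIC for a least-squares fit with normally distributed errors."""
    k = n_coefficients + 1  # assumption: error variance counted as a parameter
    neg2_loglik = n * (math.log(2 * math.pi) + math.log(sse / n) + 1)
    aicc = neg2_loglik + 2 * k + 2 * k * (k + 1) / (n - k - 1)
    bic = neg2_loglik + k * math.log(n)
    return aicc, bic

# Hypothetical comparison: 13 runs, a 6-term model versus a 7-term model.
smaller = information_criteria(sse=2.4, n=13, n_coefficients=6)
larger = information_criteria(sse=2.3, n=13, n_coefficients=7)
print("6 terms (AICc, BIC):", smaller)
print("7 terms (AICc, BIC):", larger)
```

With few runs, the small-sample correction term in AICc grows quickly as terms are added, so an extra term has to reduce the SSE substantially to be worth keeping.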

Regarding your screenshots, it seems that models with higher number of terms may be useful in your case, as model number 9 has lower RMSE and lower BIC. It may also correct the residuals plot that seems to show some kind of heteroskedasticity (bigger residuals for higher flavour score).

Hope this supplementary comment will help you,

Victor GUILLER
