
Discussions

Solve problems, and share tips and tricks with other JMP users.

Finding Effect Magnitudes for Non Orthogonal DOE

Hi, I’m pretty new to JMP and had a question about interpreting regression vs orthogonalized effects.

I ran a 3-factor, 3-level DOE (X, Y, Z) using simulations to look at main effects and interactions on a response (force). I was missing a few runs, so the design isn’t fully orthogonal.

I used Fit Model to run a regression and got the usual outputs (Estimate, t Ratio, Prob>|t|). I also looked at the Effect Screening / Pareto plot, which uses orthogonally coded effects.

I’m a bit confused about what I should actually be using:

  1. For deciding whether an effect is significant, should I rely on the regular regression output (p-values), or the orthogonalized ones?
  2. For reporting how large an effect is on the response, is it better to use the regression coefficients, the orthogonal coded values, or something like the predicted change across the factor range?
  3. When JMP reports orthogonally coded effects, are those still based on the centered (X − mean)(Y − mean) type terms, or does the orthogonalization change the reference point used for interpreting interactions?

I just want to make sure I’m interpreting everything correctly given that the design isn’t perfectly orthogonal.

Thanks!

Victor_G
Super User

Re: Finding Effect Magnitudes for Non Orthogonal DOE

Hi @RandomSquirrel1,

Welcome to the Community!

With the limited information and context you have provided, it's difficult to guide you precisely. Here are some questions to help better understand your topic and start the discussion:

  • What is the objective of your study? Identifying specific effects/terms, predicting a phenomenon, optimizing a response, exploring an experimental space...?
  • What type of design have you created? With 3 levels for each of your 3 factors, it sounds like you may have created a response surface design for continuous factors (such as an I-optimal, CCD, or Box-Behnken design), or a full/fractional/optimal factorial design for categorical factors. How many runs does the design have? Any replicate runs?
  • How many missing runs do you have, and where are they located? What is the reason behind the missing runs (impossible/unfeasible to run the experiment or to measure the response)?

I would first recommend evaluating the design you have with the missing runs (excluded rows) and comparing it with your full intended design, to better assess the possible impact of the missing runs. For this, you can use the Compare Designs platform to compare the design tables with and without the missing runs. This will help you see changes in the aliasing structure and any change in the relative estimation efficiency of the terms in your model.

Then, once you have assessed the design situation, you can evaluate how much collinearity the missing runs have created in the estimation of your model's terms by looking at the Variance Inflation Factor (VIF) values and standard errors in the Parameter Estimates panel. These will help you assess how reliable the term estimates (and associated p-values) in your model may be in the absence of some runs.

I would also compare the orthogonalized effects and p-values with the regular ones. Do you see a practically important change in parameter estimates between the two outputs? Are the p-values similar? Are any of the effects of practical importance on the response? Practical importance and statistical significance both need to be considered.

For your question about reporting, the answer is also linked to your question and objectives. What were you trying to find or study? Which of these measures has more meaning for your study?

Concerning your last question, the orthonormalization process takes the original estimates and scales them using the covariance matrix and a scaling factor linked to the number of rows and the RMSE. If your factors are well defined with the Coding column property, then the orthonormalized estimates will be based on the scaled estimates (not centered). Please be careful, as these orthonormalized estimates depend on the order in which terms are entered into the model, so make sure to specify them in an order that respects effect heredity.
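To see that order dependence concretely, here is a small sketch. It is my own Gram-Schmidt illustration with made-up data, not JMP's exact orthonormalization, but it shows the same phenomenon: in a non-orthogonal design, the value attributed to a term changes with its entry order.

```python
import numpy as np

# Sketch (not JMP's exact algorithm): sequentially orthogonalized effect
# values depend on term entry order when the design is not orthogonal.
# Design, model, and response below are all made up for illustration.
rng = np.random.default_rng(0)
levels = [-1, 0, 1]
runs = [(a, b) for a in levels for b in levels if (a, b) != (1, 1)]  # 1 run missing
x = np.array([r[0] for r in runs], float)
y = np.array([r[1] for r in runs], float)
resp = 3 + 2 * x - 1.5 * y + 0.8 * x * y + rng.normal(0, 0.1, len(runs))

def sequential_effects(columns, resp):
    """Gram-Schmidt the columns in the given order, then project the
    response onto each orthonormalized column."""
    basis, effects = [], []
    for c in columns:
        c = c.copy()
        for q in basis:
            c -= (c @ q) * q          # remove earlier terms' directions
        c /= np.linalg.norm(c)
        basis.append(c)
        effects.append(resp @ c)
    return effects

ones = np.ones_like(x)
a = sequential_effects([ones, x, y, x * y], resp)   # x entered 2nd
b = sequential_effects([ones, y, x, x * y], resp)   # x entered 3rd
effect_x_a, effect_x_b = a[1], b[2]
# With the missing run, x and y are correlated, so the two orders
# attribute different values to x.
```

With the full 3x3 grid the two orders would agree exactly, because all the columns are already orthogonal and the Gram-Schmidt step removes nothing.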

Hope this discussion starter helps you,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)

Re: Finding Effect Magnitudes for Non Orthogonal DOE

Hi Victor,

Thank you for your response.

To provide more context, my study looks at how the geometric positioning of an anchor (X, Y, Z) affects an injury metric. The goal is to identify which directions are most influential and quantify how changes in each direction affect the response (e.g., moving the anchor from its maximum to minimum position).

I treated the factors as continuous, with levels coded as −1, 0, and 1, and fitted a first-order linear regression model using least squares. Each configuration was simulated once. The response metric ranged from 40 to 55. 

Due to physical infeasibility, some runs were missing at extreme locations:

• First DOE: 3 missing out of 27
• Second DOE: 7 missing out of 27

The fitted model was:

Injury = β₀ + β₁X + β₂Y + β₃Z + β₄XY + β₅XZ + β₆YZ + β₇XYZ

Model diagnostics:

  • VIFs ranged from ~1.0 to 1.35, indicating low multicollinearity
  • Standard errors ranged from ~0.4 to ~1.1

The outputs I am trying to interpret are:
• Identifying which main and interaction effects are significant
• Estimating the magnitude of each effect (e.g., change in response from −1 to +1). My understanding is that, for coded variables, this would be approximately 2 × the regression coefficient.
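That arithmetic can be checked with a one-line prediction function (the coefficients below are made up purely for illustration):

```python
# Hypothetical fitted model on one coded factor: Injury = 47.0 + 2.5 * X.
beta0, beta1 = 47.0, 2.5

def predict(x):
    return beta0 + beta1 * x

full_span = predict(+1) - predict(-1)  # change across the -1 to +1 range
one_unit = predict(+1) - predict(0)    # change per single coded unit
# full_span is 2 * beta1 (5.0 here); one_unit is beta1 itself.
```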

When comparing the regular regression output and the orthogonalized results, the significant effects are generally consistent, although in a few cases an interaction term shifts slightly around the significance threshold (e.g., p = 0.048 vs 0.051).

Given this, would it be appropriate to use the standard regression output (Estimate, t Ratio, Prob>|t|) for determining significance and interpreting effect magnitudes?

Appreciate your help!

Victor_G
Super User

Re: Finding Effect Magnitudes for Non Orthogonal DOE

Hi @RandomSquirrel1,

OK, thanks. The information and results you added help a lot in understanding the context.

Given the number of factors, levels, and total number of runs in your designs, I assume you have created full factorial designs. These designs test every combination of the factor levels and are therefore quite robust to (some) missing values. This diagnosis seems to be supported by the results you obtained: given the low VIF values and the small differences between the orthogonalized and regular values, using the standard regression output seems acceptable.

Yes, you're right: a main-effect parameter estimate represents the change in response when the factor goes from 0 to 1 (or from −1 to 0, one coded unit), so half of your range, with the other factors held fixed. The change across the full −1 to +1 range is therefore twice the coefficient.

Some further comments:

  • Please verify that your model respects the regression assumptions by checking for the absence of patterns in the row diagnostics plots: the Actual by Predicted plot and the residual plots. This ensures your model adequately fits your data, that you are not missing terms in the model (for example, you may need to add quadratic effects such as X*X, Y*Y, and Z*Z), and that you do not need a specific transformation of your data (such as a Box-Cox transformation).
  • Use several metrics to check that your model is adapted to your data. Individual term p-values are a good start, but I wouldn't remove a term from the model based only on this information. Look at the Summary of Fit, particularly the RMSE (predictive accuracy), R²/R² adjusted (and the difference between the two, to be minimized), the statistical significance of the whole model, and the information criteria AICc/BIC (to be minimized), which help balance the model's complexity against its accuracy. If some terms are close to statistical significance, I would keep them in the model.
  • Try to validate your model with some validation runs; these will help you confirm that the model's precision is adequate and will increase trust in the estimated model terms.
  • If you want to run the validation while avoiding extreme locations, or if validation is not successful and you need additional runs to refine the model and confirm its adequacy while respecting your constraints, you can use the Augment Design platform to create new runs and specify factor constraints so that the new runs are not placed at extreme locations.
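The residual check in the first bullet can be sketched as follows: fit a straight line to data that actually contains curvature, and the residuals correlate strongly with the squared term (all numbers invented for illustration).

```python
import numpy as np

# Fit a straight line to curved data and inspect the residuals
# (all numbers invented for illustration).
x = np.linspace(-1, 1, 9)
y = 50 + 3 * x + 4 * x ** 2           # true response contains a quadratic term

X_lin = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X_lin, y, rcond=None)
resid = y - X_lin @ beta

# The leftover pattern is U-shaped: residuals track x^2 almost perfectly,
# which is the signature of a missing quadratic term.
curvature_corr = np.corrcoef(resid, x ** 2)[0, 1]
```

A plot of these residuals against the predicted values shows the same U shape JMP's residual plot would reveal.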

Hope this answer and possible next steps will help you complete your study,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)

Re: Finding Effect Magnitudes for Non Orthogonal DOE

Hi Victor,

Thanks so much for the recommendations; I'll be cautious about removing terms from the model based solely on statistical significance.

I had a question regarding the parameter estimates in the model output. For the interaction terms, instead of seeing something like XY, the model reports terms in the form (X − 0.0833)(Y + 0.0833). I’m assuming this is due to the software recentering the variables.

Would this recentering affect how the main and interaction effects should be interpreted, or is the interpretation of the coefficients still consistent with the original coded variables?

Thanks!

Victor_G
Super User

Re: Finding Effect Magnitudes for Non Orthogonal DOE

Hi @RandomSquirrel1,

Centering is helpful for higher order terms, like interactions and polynomials, to reduce collinearity between low-order and high-order effects.

There is no change in interpreting the magnitude of centered effects compared with non-centered effects, but the centered interaction term equals 0 when factors X and Y are at their mean values, whereas the non-centered interaction term equals 0 when both factors are at 0, which can sometimes be outside the studied factor ranges. You can read more about this topic in this discussion: Centering IVs in regression only in interaction
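A quick numerical check of this point (my own sketch, with made-up data): refitting with the centered product (X − mx)(Y − my) in place of XY leaves the interaction coefficient unchanged while shifting the intercept and main effects.

```python
import numpy as np

# Toy check (made-up data): centering the interaction column changes the
# intercept and main effects but not the interaction coefficient itself.
rng = np.random.default_rng(1)
levels = [-1, 0, 1]
runs = [(a, b) for a in levels for b in levels if (a, b) != (1, 1)]
x = np.array([r[0] for r in runs], float)
y = np.array([r[1] for r in runs], float)
resp = 47 + 2 * x - 1.5 * y + 0.8 * x * y + rng.normal(0, 0.2, len(runs))

def fit(cols):
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, resp, rcond=None)
    return beta

raw = fit([np.ones_like(x), x, y, x * y])
ctd = fit([np.ones_like(x), x, y, (x - x.mean()) * (y - y.mean())])
# raw[3] == ctd[3] (interaction), while the intercept and main effects
# differ, because the missing run makes the factor means nonzero.
```

This works because the centered product is just the raw product plus a linear combination of the intercept and main-effect columns, so swapping it in is a change of basis that only reshuffles the lower-order coefficients.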

Hope this answer will help you,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)
statman
Super User

Re: Finding Effect Magnitudes for Non Orthogonal DOE

Just a couple of comments/observations (in no specific order):

To help with context, can you explain why X, Y or Z would impact the injury metric (hypotheses)? If you were to predict rank order of model effects, what would that rank order look like?

1. You say you fitted a first-order model... not exactly: you included 2nd- and 3rd-order interactions in your model. There are both factorial order and polynomial order to consider.

2. You tested each variable at 3 levels, which means you can also estimate the quadratic terms. Why did you leave these out of the model? Or, if you weren't interested in non-linearity, why test at 3 levels?

3. Your response variable is an "injury metric". I have no context for this response. What is it? The variation created in your simulated experiment ranged from 40 to 55; is this of any practical value? How much of a change in that metric is meaningful? If the metric changed by 1 unit, would you care? How about 5? 10?

4. You are running a simulation. You realize the algorithm (model) already exists. Are you trying to uncover the algorithm? Why is there a physical infeasibility in a simulation?

5. What do you mean by first and second experiment? Are these replicates? What changed between them? Why would 3 runs be unfeasible in the first experiment and 7 in the second if they are replicates (identical treatment combinations)? Are any of the runs missing from the first available in the second (or vice versa)?

6. In general, your analysis should first determine practical significance; if that criterion is met, then move to graphical analysis, and lastly quantitative analysis. Are there any unusual data points? Quantitative analysis should start with a saturated model (although that is not possible with randomized replicates). Understand what makes up the mean square error estimate (which factors or noise). Without some knowledge of how the error is estimated, and whether that error is representative of real-world variation, p-values are of little use. R-square Adjusted is the default statistic to maximize, and you want to minimize the R-square minus R-square Adjusted delta (to reduce over-specification).
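The R-square vs. R-square Adjusted point can be illustrated with made-up data: adding pure-noise predictors always raises R-square, but adjusted R-square is penalized for the extra parameters, so the gap between the two widens.

```python
import numpy as np

# Made-up data: adding pure-noise predictors inflates R^2 while adjusted
# R^2 is penalized, so the R^2 minus adjusted-R^2 gap widens.
rng = np.random.default_rng(2)
n = 12
x = rng.normal(size=n)
y = 2 * x + rng.normal(size=n)

def r2_and_adj(cols):
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1 - (resid @ resid) / (((y - y.mean()) ** 2).sum())
    p = X.shape[1]                     # parameters, intercept included
    adj = 1 - (1 - r2) * (n - 1) / (n - p)
    return r2, adj

base = [np.ones(n), x]
junk = base + [rng.normal(size=n) for _ in range(4)]  # 4 noise predictors

r2_b, adj_b = r2_and_adj(base)
r2_j, adj_j = r2_and_adj(junk)
gap_b, gap_j = r2_b - adj_b, r2_j - adj_j
# R^2 never decreases when predictors are added, but the widening
# R^2 - adjusted-R^2 gap flags the over-specification.
```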

By the way, if you have challenges executing extreme vertices, a Box-Behnken design is recommended.

"All models are wrong, some are useful" G.E.P. Box

Re: Finding Effect Magnitudes for Non Orthogonal DOE

Hi statman, thanks for the detailed feedback; I appreciate the time you took to go through this.

To clarify a few points:

I was mistaken in writing it was a first order linear model; it is a linear model with interactions. The use of three levels (−1, 0, 1) was intended to better represent the design space and allow detection of potential nonlinearity, while also improving estimation of both main and interaction effects across the range of anchor positions. The midpoint was intended to contribute to the estimation of the linear and interaction coefficients and to provide a check on the linearity assumption.

Regarding practical significance, the response is sternum deflection (mm), which is directly linked to injury risk (e.g., rib fractures). Even relatively small changes in this metric can correspond to meaningful increases in injury probability, so both statistical and practical significance were considered when interpreting the results.

The two DOEs correspond to different anchor configurations rather than replicates, which is why the missing runs differ between them. The excluded configurations were at extreme locations that resulted in physically unrealistic or invalid setups, so they were treated as infeasible.

I will look more closely into the model diagnostics you mentioned, including residual behaviour, model fit metrics (RMSE, R²/R² adjusted), and the role of the error term in the absence of replication. This should help further assess whether additional terms (e.g., quadratic effects) or model refinements are warranted.

Thanks again for the suggestions regarding model diagnostics and practical interpretation.
