Discussions

stat_mr_h · Apr 17, 2025 05:49 AM

Hello everyone,

I am working on an optimisation DoE to maximize a response Y1 using three factors A, B, and C.

Now I have a second response, Y2, which is a linear combination of A, B, and C: Y2 = aA + bB + cC (the coefficients a, b, and c are known). The goal is to maximize Y1 and minimize Y2.

I have two approaches in mind:

First approach: Build an RSM design for Y1 only. Once the data are collected, compute the Y2 values from the formula and optimize Y1 and Y2 simultaneously with the Prediction Profiler.
Second approach: Build an RSM design for both Y1 and Y2, then fit both responses. However, I expect a multicollinearity problem for the Y2 model.

Since we already have Y2's linear equation, for me, it doesn't make sense to include it in the design with Y1. So for now, I prefer the first approach. But I am not sure if there is anything else I should consider.
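For readers working outside JMP, the first approach can be sketched in Python roughly as follows. Everything numeric here is an assumption made for illustration: the coefficients a, b, c, the face-centered central composite design, and the simulated Y1 values standing in for real measurements. Only Y1 is fitted; Y2 is simply computed from its known formula.

```python
import itertools
import numpy as np

# Hypothetical known coefficients for Y2 = a*A + b*B + c*C (placeholders only).
a, b, c = 2.0, 1.0, 3.0

# Assumed design: face-centered central composite in coded units (-1, 0, +1),
# with 8 corner runs, 6 axial runs and 1 center run.
corners = np.array(list(itertools.product([-1.0, 1.0], repeat=3)))
axial = np.vstack([s * np.eye(3)[i] for i in range(3) for s in (-1.0, 1.0)])
center = np.zeros((1, 3))
design = np.vstack([corners, axial, center])


def quadratic_terms(x):
    """Expand factor settings into the terms of a full quadratic RSM model."""
    A, B, C = x[:, 0], x[:, 1], x[:, 2]
    return np.column_stack(
        [np.ones(len(x)), A, B, C, A * B, A * C, B * C, A**2, B**2, C**2]
    )


# Synthetic stand-in for the measured Y1 values (NOT real data).
rng = np.random.default_rng(0)
A, B, C = design.T
y1_measured = (
    50 + 4 * A + 3 * B + 2 * C - 5 * A**2 - 3 * B**2 - 2 * C**2
    + rng.normal(0.0, 0.5, len(design))
)

# Fit the quadratic model for Y1 only, by ordinary least squares.
X = quadratic_terms(design)
beta_y1, *_ = np.linalg.lstsq(X, y1_measured, rcond=None)

# Y2 is not modeled: it is computed directly from the known formula.
y2_computed = design @ np.array([a, b, c])
```

The fitted `beta_y1` plays the role of the saved prediction formula for Y1, and `y2_computed` corresponds to a formula column added to the data table.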

Appreciate your thoughts.

Regards,

Victor_G · Apr 17, 2025 4:33 AM

Hi @stat_mr_h,

It seems your response Y2 could be some kind of indicator or price response.

Your design is built independently of the responses. The design structure depends on your objectives, the factors (factor types, number of levels...), the expected modeling complexity (assumed model), and a compromise between experimental budget and precision in the estimation of effects (aliasing structure, number of runs, replicate runs, ...).

You could build a model for response Y1, then use the prediction formula for Y1 and the theoretical formula for Y2 in the Profiler to try to optimize both responses.

As an example, using the dataset "Bounce Data" with the response "Stretch", I added a column "Price" defined by a formula.

Once a model is built on the response "Stretch" (Y1), I save the Stretch prediction formula and launch the Prediction Profiler platform with the two formulas (the predicted Stretch formula and the one calculated for Price in the data table) to optimize both responses.

This would be the first approach you mention. As you already know the formula for Y2, it doesn't make sense to try to model it.
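As a rough analogue of what the Profiler does with the two formulas, here is a minimal Python sketch that combines them through desirability functions. It is a simplified stand-in, not JMP's implementation: the Y1 prediction formula, the coefficients a, b, c, the linear desirability scaling, and the grid search are all assumptions chosen for illustration.

```python
import itertools
import numpy as np

# Hypothetical known coefficients for Y2 = a*A + b*B + c*C.
a, b, c = 2.0, 1.0, 3.0


def predict_y1(x):
    """Hypothetical saved prediction formula for Y1 (coded units)."""
    A, B, C = x[:, 0], x[:, 1], x[:, 2]
    return (50 + 4 * A + 3 * B + 2 * C
            - 5 * A**2 - 3 * B**2 - 2 * C**2 + 1.5 * A * B)


def formula_y2(x):
    """Known linear formula for Y2; nothing is fitted here."""
    return x @ np.array([a, b, c])


# Evaluate both formulas on a grid over the coded factor space [-1, 1]^3.
levels = np.linspace(-1.0, 1.0, 21)
grid = np.array(list(itertools.product(levels, repeat=3)))
y1 = predict_y1(grid)
y2 = formula_y2(grid)

# Linear desirabilities in [0, 1]: larger is better for Y1, smaller for Y2.
d1 = (y1 - y1.min()) / (y1.max() - y1.min())
d2 = (y2.max() - y2) / (y2.max() - y2.min())

# Overall desirability as the geometric mean of the two.
overall = np.sqrt(d1 * d2)

best = np.argmax(overall)
print("compromise settings (A, B, C):", grid[best])
print("Y1 =", y1[best], " Y2 =", y2[best])
```

Weighting the two desirabilities differently, or using other desirability shapes, would shift the compromise point; the equal-weight geometric mean is only one possible choice.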

It's not a multicollinearity problem: with the design used, each coefficient can be determined independently and exactly, without error (VIF = 1 and Std Error = 0).
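The VIF and standard error claim can be checked numerically with a small Python sketch. It assumes an orthogonal 2^3 full factorial in coded units and hypothetical coefficients a, b, c; the design in the Bounce Data example may differ.

```python
import itertools
import numpy as np

# Hypothetical known coefficients for Y2 = a*A + b*B + c*C.
a, b, c = 2.0, 1.0, 3.0

# Assumed design: 2^3 full factorial in coded units, which is orthogonal.
X = np.array(list(itertools.product([-1.0, 1.0], repeat=3)))
y2 = X @ np.array([a, b, c])  # exact, noise-free response

# Fit Y2 = b0 + b1*A + b2*B + b3*C by ordinary least squares.
M = np.column_stack([np.ones(len(X)), X])
beta, *_ = np.linalg.lstsq(M, y2, rcond=None)
residuals = y2 - M @ beta

# Standard errors from the residual variance and (M'M)^-1.
n, p = M.shape
sigma2 = residuals @ residuals / (n - p)
std_err = np.sqrt(sigma2 * np.diag(np.linalg.inv(M.T @ M)))

# VIFs are the diagonal of the inverse correlation matrix of the factors.
vif = np.diag(np.linalg.inv(np.corrcoef(X, rowvar=False)))

print("coefficients:", beta)
print("std errors:  ", std_err)
print("VIF:         ", vif)
```

Because Y2 is an exact function of the factors, the fit reproduces the coefficients with no residual error, and the orthogonal factor columns give a VIF of 1 for each factor.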

Hope this response will help you,

Victor GUILLER

"It is not unusual for a well-designed experiment to analyze itself" (Box, Hunter and Hunter)

statman · Apr 17, 2025 02:19 PM

For me, I find it really difficult to deal with these issues "conceptually". It really depends on the situation. I don't know what "proper way" means! For example, there are options:

Perform correlation between the 2 Y's. How are they correlated?

Create a contour plot (response surface) for each Y and overlay them. If the optima for the two responses do not coincide, then look for other factors (this means moving the design space, dimensionally, away from where it currently is).

Develop alternate response variables or possible transformations

Perform simultaneous equation solutions

I agree with Victor, you won't have a multicollinearity problem. Multicollinearity concerns correlated x's, not Y's. You may have x's that conflict and need to be set differently to achieve the goals for both Y1 and Y2, and therein lies the potential issue.

"All models are wrong, some are useful" G.E.P. Box

Proper way to implement linear combination in DoE
