Discussions

Michael_Mart · Jun 8, 2023 2:02 PM

I was wondering if I can rely on my DoE model to predict the value of a response when the factor values are outside of the tested range. I designed a full factorial DoE using three factors, each at two levels, plus two centre-point runs. The whole design was also replicated (20 total runs). For one of my responses, I established a good model and am performing Monte-Carlo simulations to make predictions about the response. I also have data from univariate runs which were not part of the DoE and which investigated factor values outside of the DoE range. Can I rely on my DoE model to confirm and explain the results from my univariate runs?

calking · Jan 14, 2021 10:12 AM

Hey @Michael_Mart!

In general, extrapolation in DOE is risky for two main reasons:

1. Your prediction uncertainty grows very quickly once you leave the design region.
2. You are assuming the model still holds beyond the design region (i.e. the underlying process does not change significantly outside it).
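
To make the first reason concrete, here is a rough sketch of how the relative prediction variance, x0'(X'X)^-1 x0, behaves for a design like the one described in the question (replicated 2^3 full factorial with two centre points per replicate, in coded -1/+1 units). The model terms (main effects plus two-factor interactions) and the function names are assumptions made for illustration, not something taken from the original analysis.

```python
import itertools
import numpy as np

def model_matrix(points):
    """Intercept, main effects, and two-factor interactions (assumed model), coded units."""
    pts = np.atleast_2d(np.asarray(points, dtype=float))
    x1, x2, x3 = pts[:, 0], pts[:, 1], pts[:, 2]
    return np.column_stack(
        [np.ones(len(pts)), x1, x2, x3, x1 * x2, x1 * x3, x2 * x3]
    )

# Replicated 2^3 full factorial with two centre points per replicate (20 runs).
corners = list(itertools.product([-1.0, 1.0], repeat=3))
centres = [(0.0, 0.0, 0.0)] * 2
design = (corners + centres) * 2

X = model_matrix(design)
xtx_inv = np.linalg.inv(X.T @ X)

def relative_prediction_variance(point):
    """Var(predicted mean at point) / sigma^2 = x0' (X'X)^-1 x0."""
    x0 = model_matrix(point)[0]
    return float(x0 @ xtx_inv @ x0)

# Coded level 1.0 is the edge of the design region; anything larger is extrapolation.
for level in (0.0, 1.0, 1.5, 2.0):
    rpv = relative_prediction_variance((level, level, level))
    print(f"all factors at coded level {level:+.1f}: relative variance {rpv:.3f}")
```

This only covers the statistical uncertainty under the assumption that the fitted model is still correct out there. It says nothing about the second reason, which no formula computed from the in-region data can check.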

That second reason is the main reason extrapolation is usually discouraged. It's very easy for the underlying process to change just outside the region of your experiment. My background is in accelerated life testing, which is built on the idea of extrapolating outside the design region and so is the exception to the rule. Even so, the assumption that the model still holds is ever-present, to the point that we will adjust our designs to protect against it failing.

Of course, one way you can "test the waters" is to collect data outside the design region, record the responses, and then use that data to test your model. That sounds very similar to your situation, so if that's the case, you're in a great position! Rather than using the model to validate those points, though, you'll be using those points to validate your model, which is always the more appropriate way to look at it.

A common way to test your model is to compute a quantity called the Mean Squared Predictive Error (MSPE), which is simply the average of the squared differences between the responses you observed and what the model predicts at those same points. If the MSPE is relatively small (say, not much larger than the variation you see in your Monte Carlo simulations), that suggests your model is performing well at those points. In that case, you might be justified in extrapolating to them.
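
As a minimal sketch of that calculation, something like the following would do. The numbers are made up purely for illustration, and `reference_variance` is a hypothetical stand-in for whatever variation estimate you decide to compare against (for example, the spread from your simulations).

```python
def mspe(observed, predicted):
    """Mean squared predictive error: average squared gap between data and model."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("need at least one validation point")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(observed, predicted)]
    return sum(squared_errors) / len(squared_errors)

# Hypothetical responses from runs outside the design region, and the
# DoE model's predictions at the same factor settings.
observed = [12.1, 13.4, 15.0]
predicted = [11.8, 13.9, 16.2]

# Hypothetical reference variation to judge the MSPE against.
reference_variance = 0.4

value = mspe(observed, predicted)
print(f"MSPE = {value:.3f}, ratio to reference variance = {value / reference_variance:.2f}")
```

A ratio well above 1 would be a hint that the model is missing something at those points. How much above 1 is "too much" is a judgment call rather than a fixed rule.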

statman · Jan 14, 2021 10:26 AM

@calking makes some great points. Just to add: extrapolation of your model is an engineering or managerial decision, not a statistical one (see Deming on enumerative vs. analytic problems). It is greatly influenced by how representative your study is of the future. What do you mean by "good model"? Did you run randomized replicates or blocks? I personally am not a fan of using simulation to test models, as I don't trust the simulations to represent the real noise in the system. Your OFATs (I love that you call them univariate runs, but univariate typically refers to the number of response variables; multivariate means multiple Y's) can certainly be assessed against the model. Make sure you analyze the residuals when comparing data obtained from tests run outside of the experiment.
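
One rough way to look at those residuals is sketched below. The run data, the `rmse` value, and the idea of tagging each run with its coded distance beyond the design region are all hypothetical choices for illustration. Scaling by the model RMSE also ignores the extra prediction variance outside the region, so treat the scaled values as a crude screen only.

```python
from statistics import mean

def residual_summary(runs, rmse):
    """runs: iterable of (coded_distance, observed, predicted) tuples.

    Returns per-run (distance, residual, residual / rmse) rows and the mean
    residual, which indicates whether the model is biased high or low.
    """
    rows = []
    for distance, observed, predicted in runs:
        residual = observed - predicted
        rows.append((distance, residual, residual / rmse))
    bias = mean(residual for _, residual, _ in rows)
    return rows, bias

# Hypothetical OFAT runs: (coded level of the varied factor, observed, predicted).
ofat_runs = [(1.25, 14.2, 14.0), (1.5, 15.1, 14.6), (2.0, 17.3, 15.8)]
model_rmse = 0.5  # hypothetical root mean square error from the DoE fit

rows, bias = residual_summary(ofat_runs, model_rmse)
for distance, residual, scaled in rows:
    print(f"coded level {distance:.2f}: residual {residual:+.2f} ({scaled:+.1f} x RMSE)")
print(f"mean residual (bias): {bias:+.2f}")
```

Residuals that all share one sign, or that grow steadily as you move further out, would point to the process departing from the model rather than to random noise.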

"All models are wrong, some are useful" G.E.P. Box

Can I extrapolate using a DoE model?
