Cooking Optimization: Should You Cook Hot and Fast, or Warm and Slow?
Jul 28, 2008 10:15 AM
(NOTE: This is part two of a three-part series on stochastic optimization.)
In my previous post, I introduced stochastic optimization. In this post, I show a real example, reported in the classic text by George Box and Norman Draper, Empirical Model-Building and Response Surfaces (page 32). JMP's Statistical R&D Director Brad Jones noticed that it makes a great robust process engineering example.
Imagine you are doing serious cooking, but instead of making food, you are cooking up chemicals, perhaps even a life-saving drug. Your cooking pot is really a chemical reactor, and people are going to depend on your product to save lives. The reaction that cooks your chemical product has two big controllable factors:
how hot you cook it (temperature)
how long you cook it (time)
The reaction you make converts the initial ingredient, A, into the chemical you want, B. But if you cook it too hot and long, the B that you make will turn into another chemical, C. Here is the picture. Remember that we want to maximize the green B, and minimize the blue and red, A and C:
This fits a classic optimization framework with a well-defined optimum that maximizes the yield of B.
Here are the formulas as they appear in the JMP table. The yield formula is a function of time and the reaction rates; the reaction rates are themselves formulas, functions of temperature. We don’t even have to estimate the theta parameters; they are already known. The reaction temperature is already in Kelvin, so these are basically Arrhenius-type models, well-known to chemists.
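For readers without the JMP table at hand, the structure of those formulas can be sketched in Python. This is the standard textbook solution for consecutive first-order reactions; the theta constants below are placeholders I invented so the code runs, not the known values from Box and Draper, so the numbers will not match the post exactly.

```python
import numpy as np

# Consecutive first-order reactions A -> B -> C.
# The theta values are illustrative placeholders, NOT the known
# constants from Box and Draper.

def k1(T):
    """Arrhenius-type rate for the desired step A -> B at temperature T (Kelvin)."""
    return np.exp(33.0 - 16000.0 / T)   # theta1 - theta2 / T (placeholders)

def k2(T):
    """Arrhenius-type rate for the unwanted step B -> C."""
    return np.exp(23.0 - 12000.0 / T)   # theta3 - theta4 / T (placeholders)

def yield_B(t, T):
    """Fraction of the starting material present as B after time t at temperature T."""
    a, b = k1(T), k2(T)
    return a / (a - b) * (np.exp(-b * t) - np.exp(-a * t))
```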
So let's optimize. In JMP, we use the Profiler to visualize cross-sections of the response surface for yield, and we use a command there to find the settings that maximize yield. Here, we see that we must cook hot and fast to maximize yield at .621 (temperature at 539.95 Kelvin, time at .1158).
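Outside JMP, the same step can be sketched as a brute-force grid search over the two factors. Because the Arrhenius constants here are placeholders I made up, the optimum reported will not match the post's .621, but with these constants too the best settings land hot and fast.

```python
import numpy as np

# Illustrative kinetics (placeholder thetas, not the constants from the book).
def yield_B(t, T):
    a = np.exp(33.0 - 16000.0 / T)   # rate of A -> B
    b = np.exp(23.0 - 12000.0 / T)   # rate of B -> C
    return a / (a - b) * (np.exp(-b * t) - np.exp(-a * t))

# Brute-force grid search over temperature and time, standing in for the
# Profiler's maximize command.
temps = np.linspace(510.0, 560.0, 201)   # Kelvin
times = np.linspace(0.005, 0.5, 1000)
tt, TT = np.meshgrid(times, temps)
Y = yield_B(tt, TT)
i, j = np.unravel_index(np.argmax(Y), Y.shape)
best_T, best_t, best_y = TT[i, j], tt[i, j], Y[i, j]
print(f"max yield {best_y:.3f} at T = {best_T:.1f} K, time = {best_t:.4f}")
```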
Another perspective, using horizontal cross-sections, is available with the contour profiler, where we can see various combinations of temperature and time that will produce a good yield of at least .60 (unshaded) or .61 (inside the red contour line), with the crosshairs at the optimal settings to produce a yield of .621.
But we can’t really control the temperature or the reaction time exactly. The temperature and time vary, at least in a production situation.
Suppose that the standard deviation of temperature is 1 and the standard deviation of time is .03. In the contour plot, that is represented by the black ellipse, which would contain 95 percent of the variation in the two factors. Notice that the variation in time means that many batches will fall into the pink zone and fail to achieve even a yield of .60. How bad will it be?
The Profiler has a built-in simulation facility, so we enter the standard deviations there and click the Simulate button.
We have a lower specification limit of .55 for yield, which the Profiler's simulator shows as a red line on the histogram. If a batch fails to achieve .55, it must be discarded. At the current settings for the centers of temperature and time, the process produces 4.2 percent bad batches. That is not good.
Let's try other settings. Suppose that I lower the temperature to 535 and then set time to the value that maximizes yield at that temperature. There, my defect rate drops to around 1.9 percent, which is much better. So the settings that maximize nominal yield do not minimize the defect rate when the inputs vary.
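This kind of what-if comparison is easy to sketch as a Monte Carlo simulation, in the spirit of the Profiler's simulator. The kinetic constants are again illustrative placeholders, so the defect rates will not reproduce the 4.2 and 1.9 percent figures, but the same pattern appears: among settings that each maximize nominal yield at their own temperature, the hotter, faster center transmits far more of the input variation into bad batches.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative kinetics (placeholder thetas, not the published constants).
def yield_B(t, T):
    a = np.exp(33.0 - 16000.0 / T)
    b = np.exp(23.0 - 12000.0 / T)
    return a / (a - b) * (np.exp(-b * t) - np.exp(-a * t))

def best_time(T):
    """Nominal time maximizing yield at temperature T (analytic optimum)."""
    a = np.exp(33.0 - 16000.0 / T)
    b = np.exp(23.0 - 12000.0 / T)
    return np.log(a / b) / (a - b)

SD_TEMP, SD_TIME, LSL = 1.0, 0.03, 0.55   # sds and spec limit from the post

def defect_rate(T0, t0, n=200_000):
    """Fraction of simulated batches whose yield falls below the spec limit."""
    T = rng.normal(T0, SD_TEMP, n)
    t = np.clip(rng.normal(t0, SD_TIME, n), 1e-9, None)  # time cannot go negative
    return np.mean(yield_B(t, T) < LSL)

hot = defect_rate(560.0, best_time(560.0))    # hot and fast
cool = defect_rate(540.0, best_time(540.0))   # cooler and slower
print(f"hot-and-fast defect rate:      {hot:.1%}")
print(f"cooler-and-slower defect rate: {cool:.1%}")
```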
Remember my blog post about finding the “flats”? Most optimization ends up on a hill, pushed against some factor limit. But if we find a flatter place, it will reduce the variation transmitted to the response. Flatness means the slopes are very small or zero in every direction, and we can model those slopes (gradients, derivatives). The Profiler has a built-in feature to declare one or more factors as “noise factors”; it then models the derivatives of the response surface with respect to those noise factors and tries to jointly optimize, maximizing yield while minimizing the slopes. After maximizing this, we see that we are now on a flat area where the gradient is near zero in both directions.
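One way to sketch that noise-factor idea outside JMP: penalize the squared slopes of the response with respect to each noise factor, scaled by that factor's standard deviation, and maximize the penalized objective. The weight on the penalty is an arbitrary knob I chose, and the kinetics are the same placeholder constants as before. Notably, with these constants the flattened optimum barely moves from the plain hot-and-fast optimum, which mirrors what the post reports next.

```python
import numpy as np

# Illustrative kinetics (placeholder thetas, not the published constants).
def yield_B(t, T):
    a = np.exp(33.0 - 16000.0 / T)
    b = np.exp(23.0 - 12000.0 / T)
    return a / (a - b) * (np.exp(-b * t) - np.exp(-a * t))

SD_TEMP, SD_TIME = 1.0, 0.03   # noise-factor sds from the post
WEIGHT = 10.0                  # arbitrary yield-vs-slope trade-off knob

temps = np.linspace(510.0, 560.0, 201)
times = np.linspace(0.005, 0.5, 1000)
tt, TT = np.meshgrid(times, temps)
Y = yield_B(tt, TT)

# Central-difference slopes with respect to each noise factor,
# scaled by that factor's standard deviation so they are comparable.
h_t, h_T = 1e-5, 1e-3
dY_dt = (yield_B(tt + h_t, TT) - yield_B(tt - h_t, TT)) / (2 * h_t)
dY_dT = (yield_B(tt, TT + h_T) - yield_B(tt, TT - h_T)) / (2 * h_T)
score = Y - WEIGHT * ((SD_TIME * dY_dt) ** 2 + (SD_TEMP * dY_dT) ** 2)

i, j = np.unravel_index(np.argmax(score), score.shape)
flat_T, flat_t = TT[i, j], tt[i, j]
print(f"flattened optimum: T = {flat_T:.1f} K, time = {flat_t:.4f}, "
      f"yield = {Y[i, j]:.3f}")
```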
Now we use the simulator to calculate the defect rate: 3.3 percent. This is not much different from the fixed optimum; in fact, the factor settings themselves are not much different from the fixed-optimal hot-and-fast settings.
Haven’t we landed on a flat spot? Take a look at a surface plot.
The two grids intersect at the current values, and you can see that we have landed on a relatively flat spot near the top of the hill. But it is the top of a fairly narrow ridge. Even though the first derivatives are small here, the second derivative is large: the surface bends sharply, with a steep drop-off on either side. So we might consider finding a flat spot in a second-degree sense.
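That second-degree flatness can be checked numerically with a finite-difference second derivative of yield with respect to time. With the same placeholder kinetics as before, the curvature at the ridge top shrinks dramatically as the temperature drops: the cooler, slower ridge is far flatter, even though its peak yield is a bit lower.

```python
import numpy as np

# Illustrative kinetics (placeholder thetas, not the published constants).
def yield_B(t, T):
    a = np.exp(33.0 - 16000.0 / T)
    b = np.exp(23.0 - 12000.0 / T)
    return a / (a - b) * (np.exp(-b * t) - np.exp(-a * t))

def best_time(T):
    """Nominal time maximizing yield at temperature T (analytic optimum)."""
    a = np.exp(33.0 - 16000.0 / T)
    b = np.exp(23.0 - 12000.0 / T)
    return np.log(a / b) / (a - b)

def curvature(t, T, h=1e-4):
    """Central-difference second derivative of yield with respect to time."""
    return (yield_B(t + h, T) - 2.0 * yield_B(t, T) + yield_B(t - h, T)) / h**2

curv = {T: curvature(best_time(T), T) for T in (560.0, 540.0, 520.0)}
for T, c in curv.items():
    print(f"T = {T:.0f} K: optimal time {best_time(T):.3f}, curvature {c:.1f}")
```

The curvature is negative at each ridge top (it is a maximum in time), but its magnitude drops by an order of magnitude or more per temperature step here, which is the sense in which cooler-and-slower is flatter.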
But there is a better way to find the stochastic optimum, the factor centers that minimize the defect rate. Stochastic programming does this, but stochastic programming is hard. How can we make it simple? The answer will be in my next blog post. It turns out that this technique can reduce the defect rate by an order of magnitude, so it is very valuable. We need to move from hot-and-fast to cooler-and-slower to achieve it, and there is a great way to find the best settings.
UPDATE: The third blog post in this series is also available.