A Mixture/Process Experimental Design and SVEM Analysis for an Esterification Reaction
An experimental design was created to study the formation of an unwanted byproduct in an esterification reaction. Four mixture component factors plus one process-based factor were used to generate a 26-run space filling experimental matrix, specifically for analysis using Self-Validated Ensemble Modelling (SVEM). This approach was selected over a traditional mixture design intended for a polynomial Scheffe model. The resulting predictive model was an excellent fit to the data, clearly identifying the impact of each factor on the level of byproduct formed. This information was used to accelerate the development of a kinetic model and scale up the process.
Hello, I'm Andrew Fish. I'm a Senior Principal Researcher at Johnson Matthey. I'm a chemist by background. I've been working with JMP since 2016. My poster today is A Mixture/Process Experimental Design and SVEM Analysis for an Esterification Reaction.
By way of introduction, within the catalyst technologies business of Johnson Matthey, we sell catalysts and flowsheets for a range of different technologies, and that lends itself quite well to design of experiments and to advanced data analytics. We develop catalyst formulations, and we optimize process conditions for chemical reactions. What we do is we work at small scales in the laboratory, and we translate that to commercial-sized reactors.
The reaction which we're going to focus on for this poster is an application of the Fischer esterification reaction. I don't want to dwell on the chemistry because that's not the focus of this poster. But just by way of background, what we have is a reaction where we are taking an acid and reacting it with an alcohol in the presence of an acid catalyst to form an ester product and water. This is a reversible reaction, so it can go both ways. If we don't get the conditions right, we're not going to end up maximizing the amount of our ester product.
The added complication we have here is that we have a side reaction. This is reaction 2, where the alcohol can react with the same acid catalyst to produce an ether. That reaction is an irreversible reaction, which means that if the ether is formed, we can't get our alcohol back. We've consumed our reactants, and it means the reaction just isn't as efficient as it can be.
What we're trying to do in this process is minimize the amount of this byproduct, ether, and try and maximize the amount of the ester that's formed under these conditions.
To do that, we used a design of experiments. Normally, we would tend to use a generic mixture design with one process factor. We're going to take a slightly novel approach of using a space-filling design here. To summarize the factors, we have our continuous factor, which is temperature. We then have four traditional mixture components, which are the amount of alcohol, the amount of acid catalyst, the amount of ester, and the amount of water. The final component is the original acid, but we're going to fix that value at 25% of the mixture, which means the remaining four mixture components have to sum to a total of 75%.
Then what we're looking for is the amount of ether that we're producing, and we're going to try and minimize that. This is a homogeneous equilibrium reaction, so what we're going to do is all of these components are going to be present at the same time. We're going to heat it up to temperature, and then we're going to measure after 30 minutes the amount of ether that's formed.
I'm just going to come out into JMP to show a normal, traditional mixture design, which is one I've prepared before. I'll just reload this.
In a traditional design, we would introduce the temperature as a continuous factor. We would introduce our four mixture components as mixture components. Then we would use a Scheffe Cubic type model, so we'd add all the terms in to make that occur. That would suggest that we need around about 40 experimental runs to be able to create a data set large enough and to be able to analyze that data.
If we look at how that looks in a traditional ternary plot, we can see here the factor combinations of the different mixture factors and the experiments we would run over those 40, you can see it's exploring the space quite well.
The issue that we have potentially is that with the temperature, we're only really looking at high and low temperatures, and we're still at the extremes of the mixture factor settings. What I'm going to do now in the way we design this experiment is we're going to use a Space-filling Design instead, and we're going to use SVEM to analyze the results of that.
To build the space-filling design, I'm going to go into DOE, then Special Purpose, then Space Filling Design. I'm going to load in my factors, which I've prepared earlier. I'm going to also load in my response. Okay, so I've got my ether as a response, which I'm trying to minimize. I've got temperature as a continuous factor. The difference here is that instead of introducing these four mixture components as mixture factors, I'm going to leave them as continuous. I'm still going to specify the range. What I'm going to do this time is specify some constraints, so I'm going to load in my constraints.
What that says is that the sum total of the four mixture components must be less than or equal to 75%, or 0.75 as a mole fraction. I'm going to put in the equivalent negative constraint as well, so that the sum must also be at least 0.75. Together, the two inequalities force the sum to be exactly 0.75, and that makes the maths work in the background. When that's done, I'm not restricted by a polynomial model in terms of the design space and how that's followed through. I can specify how many runs I want.
In this case, I've only got enough time to do 25 runs. I'm going to select 25 runs, and then we're going to make the design. We have a 25-run design here to a lot of decimal places, which isn't going to be possible, but we're going to target these in our experiments.
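As a rough illustration of the idea only, and not of the algorithm JMP uses, a constrained space-filling design can be sketched in Python by drawing many candidate points that already satisfy the 0.75 sum and then greedily picking the 25 that are furthest apart. The function name, the coded 0 to 1 temperature range, and the Dirichlet candidate sampling are all assumptions made for this sketch, and it ignores the individual component ranges set in the talk.

```python
import numpy as np

def space_filling_mixture_design(n_runs=25, n_candidates=5000, total=0.75,
                                 temp_range=(0.0, 1.0), seed=0):
    """Greedy maximin selection from random candidates (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Dirichlet draws sum to 1, so scaling by `total` makes each row sum to 0.75.
    mix = rng.dirichlet(np.ones(4), size=n_candidates) * total
    # Temperature is a free continuous factor (hypothetical coded range).
    temp = rng.uniform(*temp_range, size=(n_candidates, 1))
    cand = np.hstack([temp, mix])
    # Put every column on a 0-1 scale so distances treat factors equally.
    lo, hi = cand.min(axis=0), cand.max(axis=0)
    scaled = (cand - lo) / (hi - lo)
    # Start from the first candidate, then repeatedly add the candidate
    # whose nearest already-chosen point is furthest away.
    chosen = [0]
    dist = np.linalg.norm(scaled - scaled[0], axis=1)
    for _ in range(n_runs - 1):
        nxt = int(np.argmax(dist))
        chosen.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(scaled - scaled[nxt], axis=1))
    return cand[chosen]

# Columns: temperature (coded), then the four mixture components.
design = space_filling_mixture_design()
print(design.round(3))
```

Because the candidates are generated on the constraint surface, every selected run sums to 0.75 across the four mixture columns without needing the pair of inequalities explicitly.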
If we look at the results of this versus what we did before in the ternary plot, we can see it's very similar. We're covering a lot of the space, but instead of being at the edges, there's a lot more in the middle. Again, in our multivariate plot, we can see that with the temperature we're now covering a lot more of the middle of the space rather than the edges, versus the original traditional mixture, Scheffe Cubic type design, and we're doing this in 15 fewer runs as well.
The aim of this really is once we've collected this data, we can then apply a machine learning type neural network algorithm to it instead of a traditional polynomial model and hopefully increase the resolution, and the understanding that we get out of this system. I will head back to the poster.
I've already said before, the experiments, there's going to be 25 of them. We actually made this 26, so we included a repeat. The red dot that you now see in these plots is the repeat test, and that was just to ensure that there was good reproducibility in our measurement of the ether.
As I said before, the mixture factors have been treated as continuous. The mixture sum is 75%. We're carrying out these experiments in mini-autoclaves. We're going to get them up to temperature, leave them for 30 minutes, and then use an analytical technique called gas chromatography to measure the amount of ether after that 30 minutes, and that's going to be our response for these 26 experiments.
We carried on. We did those experiments. It didn't take too long. We then did some slight modifications to the data set. What I've done before up to this point, is I've talked about these factors in terms of percentages. It's easier to work with later on if we transform this to a mole fraction. Essentially, I've just divided the percentage by 100, so we get a number that sums up to 0.75 instead of to 75%.
The second problem, and I'll come out into JMP again here and go to my results, is that these are our components, and in some cases the sum total adds up to more than 0.75, or more than one if we add the fixed concentration of the acid. The reason for that is that these factors have been measured as part of the experiment. We can't fully achieve the targets we set.
What I've done in this case is a bit of a manual tweak. I've taken the largest component in the mixture, which happens to be the ester, and I've just adjusted that, so you can see here. I've adjusted by 0.01 just so that everything sums to one, and that just helps the maths work in the background. For the 26 experiments in the data set, that wasn't a big issue to do.
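A minimal sketch of that clean-up step, assuming the four measured components arrive as percentages and that the whole shortfall or excess is absorbed by whichever component is largest in each run. The function name and the example numbers are hypothetical, not values from the study.

```python
import numpy as np

def to_mole_fraction_and_close(percent_rows, target=0.75):
    """Convert percentages to fractions, then adjust the largest
    component in each row so the row sums to `target`."""
    x = np.asarray(percent_rows, dtype=float) / 100.0
    gap = target - x.sum(axis=1)           # negative when the row sums too high
    largest = x.argmax(axis=1)             # index of the biggest component per row
    x[np.arange(len(x)), largest] += gap   # absorb the gap in that component only
    return x

# Hypothetical run, ordered alcohol, acid catalyst, ester, water (sums to 76%).
measured = [[20.0, 5.0, 41.0, 10.0]]
closed = to_mole_fraction_and_close(measured)
print(closed, closed.sum(axis=1))
```

In this made-up row the ester is the largest component, so it is the one reduced by 0.01, which mirrors the manual tweak described above.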
With those small adjustments made, we also confirmed that the repeat run gave us a result within experimental error in terms of the ether concentration. All good to go and progress. We then started looking at the modeling of the data set.
As I've mentioned before, the reason we use a space-filling design over a traditional Scheffe Cubic polynomial one is that it potentially needs fewer runs, and we can apply neural networks to it and hopefully increase the resolution, without being restricted by quadratic or cubic terms, which are very limited functions.
Neural networks are generally applied to really large data sets. You need a lot of data. You can't normally apply them to small DoE-type data sets, because every run in a DoE is important. You can't afford to hold out certain runs as a validation or test set, because if you do, you violate the design structure. SVEM gets around that. I don't have time, unfortunately, to go into the full background of SVEM. I'm just going to show how you apply it.
But a little bit of background: SVEM stands for self-validated ensemble modeling. For the self-validation part, normally when you're fitting a neural network, you would divide your data into a training set, a validation set, and a testing set, so you would partition your data.
Within self-validation, what we're going to do is we're going to replicate the data set, and we're going to use something called paired fractionally weighted bootstrapping with a gamma distribution. Effectively, you get anti-correlated pairs of each data point, one with a high weight, one with a low weight, one is used in the training set, one is used in the validation set, and you build your neural network using that.
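To make the weighting idea concrete, here is a small sketch of one commonly described way to generate the anti-correlated pairs: draw a uniform random number per run and use its negative log as the training weight and the negative log of its complement as the validation weight. Each weight is then exponentially distributed, which is a gamma distribution with shape 1. Whether the Predictum add-in uses exactly this scheme is an assumption, and the function name is hypothetical.

```python
import numpy as np

def paired_fractional_weights(n_rows, rng):
    """Return (training, validation) weights for one bootstrap iteration."""
    u = rng.uniform(size=n_rows)
    u = np.clip(u, 1e-12, 1 - 1e-12)   # keep both logs finite
    w_train = -np.log(u)               # large when u is near 0
    w_valid = -np.log1p(-u)            # -log(1 - u): large when u is near 1
    return w_train, w_valid

rng = np.random.default_rng(0)
w_train, w_valid = paired_fractional_weights(26, rng)
# A run that counts heavily in training counts lightly in validation.
print(np.corrcoef(w_train, w_valid)[0, 1])
```

Every run keeps a non-zero weight in both copies of the data, so no run is ever dropped from the design; only its relative influence changes from one bootstrap iteration to the next.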
Then where the ensemble modeling comes in is you build lots of those models using different weightings from the gamma distribution. Then you average them to get the final model. That's effectively how SVEM works. In theory, it gives you a model with higher resolution than a traditional polynomial fit. It can be really applicable for mixture designs in particular. That's what we did.
This is our resulting SVEM model. It was a neural network algorithm. I used 50 bootstrapped models and averaged those. It was quite a simple neural network. There was only one layer with three hyperbolic tangent functions. Again, I'm going to come out of the presentation and just go into JMP and show you how this works.
This was all done, the SVEM, via an add-in made by a company called Predictum. This is a licensed add-in. We're going to build a neural network. What we're going to do is I'm going to select my factors, which is the temperature, the acid catalyst, the alcohol, the water, and the ester, which I've transformed by the adjustment of 0.01 to make everything sum up. I'm going to select my ether as my response. I'm going to click Run, and I'm going to end up with a dialog to launch the SVEM.
I can select how many models are going to be averaged, how many bootstrap models. I'm just going to leave it at 50. I'm going to leave it fairly simple and have three TanH functions in the first layer.
You can modify these as much as you like and bring in linear and Gaussian functions. I'm just going to leave it as we are. It's going to go ahead and run the SVEM, and this will be different every time you do it because the fractional bootstrapping will be different each time. We should get something similar to what I had before.
Here is my actual-by-predicted plot, where this is the actual value of my ether response, and this is the value predicted by the SVEM. You can see I've got a really fantastic R² of 0.99 with quite a low error associated with it. I would then save my columns to the table, which I've already done. This is here, just to show you what this formula looks like versus a standard formula.
Effectively, the ether concentration, or response, is a combination of these tanh functions multiplied by coefficients, multiplied by our different mixture and process factors. This here is one model, this is the second model, this is the third model, and so on, down to a total of 50 models. Then all of those models are averaged, and that gives us our prediction formula. That's effectively how SVEM works, and it gives us that really nice prediction.
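The structure of that saved formula can be sketched as follows: each of the 50 models is a single hidden layer of three tanh nodes feeding a linear output, and the final prediction is the plain average over all of them. The coefficients below are random placeholders, not the fitted values, and the function names and the example factor settings are hypothetical.

```python
import numpy as np

def single_network(x, W1, b1, w2, b2):
    """One hidden layer of tanh nodes followed by a linear output."""
    hidden = np.tanh(x @ W1.T + b1)   # shape (n_rows, n_nodes)
    return hidden @ w2 + b2           # shape (n_rows,)

def svem_predict(x, models):
    """Average the predictions of every bootstrapped network."""
    preds = np.stack([single_network(x, *m) for m in models])
    return preds.mean(axis=0)

rng = np.random.default_rng(1)
n_models, n_nodes, n_factors = 50, 3, 5
# Placeholder coefficients: (hidden weights, hidden biases, output weights, output bias).
models = [(rng.normal(size=(n_nodes, n_factors)), rng.normal(size=n_nodes),
           rng.normal(size=n_nodes), rng.normal()) for _ in range(n_models)]

# Hypothetical settings: coded temperature, acid catalyst, alcohol, water, ester.
x_new = np.array([[0.5, 0.05, 0.30, 0.10, 0.30]])
ether_hat = svem_predict(x_new, models)
print(ether_hat)
```

In the real workflow each set of coefficients would come from fitting a network under one draw of the paired bootstrap weights, rather than from a random number generator.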
I'll go back to the poster. By way of comparison, even though the design wasn't particularly intended for it, what I did was build a least squares regression model using the Scheffe Cubic terms, and I used stepwise parameter selection to narrow down the terms in the model. I built that model and compared it directly against the SVEM model.
Here you can see the model comparison, so the SVEM predictions are in red. The least squares regression predictions are in blue. There's a higher R² value for the SVEM model, and a much lower error associated with it as well. Also on the residuals plot, again with the same colors, the least squares regression model has much higher residuals for certain data points. So the SVEM is producing a better model with effectively a 25-run data set. Normally it's very, very difficult to apply neural networks to a DoE. I'm just going to go back.
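For anyone wanting to reproduce that kind of comparison outside JMP, the two summary statistics can be computed directly from the actual and predicted columns. This is a generic sketch; the numbers in the example are invented purely to show the calculation and are not the study's data.

```python
import numpy as np

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def rmse(actual, predicted):
    """Root mean squared error of the predictions."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Invented example values, only to demonstrate the functions.
actual = [1.0, 2.0, 3.0, 4.0]
model_a = [1.0, 2.0, 3.0, 5.0]
print(r_squared(actual, model_a), rmse(actual, model_a))
```

Running both functions on the SVEM prediction column and on the least squares prediction column gives the same pair of numbers that the comparison report summarizes.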
Looking at how that impacts the overall purpose of this work, which was to minimize the ether formation, we can do that by looking at the Prediction Profiler. I just exported that prediction formula into the Prediction Profiler. Again, I'm comparing the SVEM model versus the least squares regression model, with SVEM colored in red at the top.
What you can see here is that there are differences between the two. We do have much higher resolution on the SVEM model, whereas the least squares regression is limited by polynomial curves. Essentially, for a low amount of ether byproduct, we need to end up with a low-temperature reaction, low acid catalyst in the mix, mid-range alcohol, a high level of water, and a low level of ester.
There are some surprising results there for us, but we're fairly confident after looking at that actual-by-predicted plot that we've modeled the system quite well, and you can see differences between the two models. For example, here, the acid catalyst hasn't been picked out as an important factor in the least squares regression model, and the direction of the trend versus ester is completely different. That explains the differences in the R² value and the predictions of the least squares regression.
You can also see some nice features here, for example, on the alcohol, where we've got a dip in the middle, versus just a standard polynomial, which doesn't really pick it up as much in the least squares regression.
To summarize, instead of a traditional mixture-type design, we've used a space-filling design of experiments. We've treated the mixture components or the mixture factors as continuous factors with constraints built-in, and that's how we've accounted for the sum total of the mixture.
We applied SVEM as a modeling technique to maximize the information we've got from a very small data set. We've increased the resolution of that prediction versus a traditional polynomial type model, and that's really helped us to understand the conditions to minimize ether formation in this chemical reaction. It's also accelerated the time for us to start building a kinetic model for this whole system.
In terms of future work, we do also have time series data from these experiments. Instead of just having the data point at 30 minutes, we also have data points at 5, 10, 15, and 20 minutes. We can do a bit more processing and try to integrate that to calculate actual chemical rates and build a more developed kinetic model. That's where our attention is going to focus in the future.
Also, many thanks to a lot of the scientists and engineers at Johnson Matthey on the technology team who contributed to this work. Finally, thanks to Predictum, who provided training in the use of SVEM and also the licensed use of the add-in. Thank you for listening to this poster presentation.