Thanks, everybody. I'm Mattia Vallerio, Advanced Process Control at the Solvay site in Spinetta Marengo, Italy. Today I'm here to present work that we did together with the University of Leuven on the analysis of industrial batch data. More specifically, I will present a JMP plugin that we developed that uses autoML to do feature screening, and then I will move on to using the Functional Data Explorer to analyze batch data.
The idea is that, on one side, autoML is used for automated screening of relevant parameters, and on the other side, functional principal components are used for anomaly detection on batch manufacturing processes.
While doing that, I will also talk about the need to align data time-wise to be able to analyze it properly: why you need to do it and how you can do it in a simple way. Just for reference, this is work that has been published in a book, but it's also available on arXiv. This is the reference with all the authors listed, and you can download it for free. Feel free to have a look at it; you will find more details on what I will talk about today.
In the same way, the plugin that I will present is freely available on GitHub, but also on the JMP Community page, in the material for this talk, and on a dedicated page that is also called predictor-explainer.
Moving back to the talk: the data that we use is based on a use case that was published by Salvador Munoz back in 2003. There, you can download the code where he uses PCA and PLS methods to analyze batch data. The use case contained within it is also used in this talk and in the publication that I showed just before.
If we look at the data that we are analyzing, it is a drying process. This drying process is composed of three different phases: the deagglomeration phase, the heating phase, and the cooling phase, which you can see here as phases 1, 2, and 3. The purpose of this process is fairly simple: to remove solvent from the cake of material that has been introduced into the drying unit.
As you can see, a different initial cake weight is introduced into the system each time, and there is variation because the starting material is different every time. The purpose is to reach a specific target concentration for the solvent at the end, so the cake must be neither too dry nor too wet at the end of the phase. You can already see clearly from this picture that we have some variation in the shape and time duration of the temperature profile, and therefore of the process itself.
If we go a bit further in analyzing the data, you can see that we have a variety of different batch durations; this is the color in the legend on the right side. You can see here, even more clearly than before, that there are different shapes. This is also true for the solvent concentration. This shouldn't be too much of a shocker for anybody in the process industry: the longer the batch, the lower the final solvent concentration, and the shorter the batch, the higher the final solvent concentration, more or less, with a few exceptions.
But as you can see, the lengths are all over the place, and the main phases are not aligned. If you took the data for all these batches now and started to analyze it, you would be comparing the wrong samples. For example, at this point in time you would be comparing data from the deagglomeration phase with data from the heating phase, or even data from the cooling phase with data from the deagglomeration phase. Of course, this is not what we would like to do.
That's why, before you do anything else with the data, it's important that you actually squeeze, shrink, or stretch the data so that all the different batches have the same length.
You can do this in different ways. This is technically called dynamic time warping, and it's also a feature included in JMP when you do functional data exploration. But there are different ways to do it; very complex mechanisms and algorithms have been developed over the years. The references for these methods can be found in the publication that I just showed you.
But the drawback of this… One of the drawbacks of the advanced methodologies is that you need a reference trajectory in order to be able to use most of the dynamic time-warping algorithms.
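To make the idea concrete, here is a minimal dynamic-time-warping sketch in Python (not the plugin's code, just the textbook recursion): a query trajectory is aligned against a chosen reference trajectory, which is exactly the reference these algorithms require. The function name and the use of plain NumPy are illustrative assumptions.

```python
import numpy as np

def dtw_path(query, ref):
    """Return the DTW alignment path between two 1-D trajectories."""
    n, m = len(query), len(ref)
    # cost[i, j]: cheapest cumulative cost to align query[:i] with ref[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(query[i - 1] - ref[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],       # stretch query
                                 cost[i, j - 1],       # stretch reference
                                 cost[i - 1, j - 1])   # match
    # Backtrack from the end to recover the warping path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]
```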
There are other ways that you could use to synchronize the batches. One applies if you have a monotonically increasing latent variable; most of the time this is the conversion, or the total amount of material fed into the reactor, so the cumulative feed in the reactor. This can be used as a standardized axis on which to plot the data and have all the batches aligned.
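As a hedged illustration of this trick (the column name feed_rate and the helper are hypothetical): integrate the feed into a cumulative total, normalize it to [0, 1], and interpolate every sensor onto that common progress axis instead of clock time.

```python
import numpy as np
import pandas as pd

def reindex_on_cumulative_feed(batch: pd.DataFrame, n_points: int = 200) -> pd.DataFrame:
    """Re-index one batch onto a normalized cumulative-feed axis.
    Assumes feed_rate >= 0, so the cumulative feed is monotonically increasing."""
    cum_feed = batch["feed_rate"].cumsum()
    progress = (cum_feed - cum_feed.iloc[0]) / (cum_feed.iloc[-1] - cum_feed.iloc[0])
    grid = np.linspace(0.0, 1.0, n_points)
    # Interpolate every sensor onto the common progress grid.
    aligned = {col: np.interp(grid, progress, batch[col]) for col in batch.columns}
    return pd.DataFrame(aligned, index=grid)
```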
The methodology that we used for this use case, for this talk, and that we propose in the article that we wrote, is to normalize the data based on the automation triggers. By automation triggers we mean the changes between the different phases.
Every beginning and end of a phase is then normalized, as you can see here: the deagglomeration phase goes from 1 to 2, the heating phase goes from 2 to 3, and the cool-down phase goes from 3 to 4. Then all the data is squeezed or stretched to fit into these buckets. Then something very nice happens: you can directly see abnormal batches much more clearly than you would have on the left side.
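A minimal sketch of this trigger-based normalization (the time and phase column names are assumptions; the phase index is what comes out of the automation triggers): within each phase, time is rescaled so that phase k maps onto the interval [k, k+1], stretching or squeezing every batch onto the same axis.

```python
import pandas as pd

def normalize_phase_time(batch: pd.DataFrame) -> pd.Series:
    """Map the clock time of one batch onto the normalized phase axis.
    Assumes each phase has more than one sample."""
    def scale(group: pd.DataFrame) -> pd.Series:
        t = group["time"]
        frac = (t - t.min()) / (t.max() - t.min())  # 0..1 inside the phase
        return group["phase"] + frac                # phase k -> [k, k+1]
    return batch.groupby("phase", group_keys=False).apply(scale)
```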
If you then look at the plot of the phase time, the one in the middle, you can clearly see that the inclination of the line basically tells you how long the batch lasted: the steeper the line, the longer the phase that we are currently looking at.
The drawback of this methodology is that it cannot be applied online, of course. It can only be applied once the batch is finished, or once the phase is finished; online, it's basically impossible to know when the phase is going to end. You therefore need to resort to other kinds of alignment procedures, like the dynamic time warping described in the paper. I won't be touching on that today. That's it for this part.
How do you actually analyze batch data? There are different ways to do that. The first way that we are looking at is by using fingerprints. What do we call fingerprints? You can define a fingerprint as an aggregated statistical summary of the data, built from summary statistics that have physical meaning or engineering value. These are normally the variables that your engineers look at to know whether a batch is going correctly or has been performing well. If you ask your experts in the field or on the process, they will have these kinds of KPIs that they monitor to know whether a batch has been performing well or not.
For example, one of those could be the maximum level of the tank in the deagglomeration phase, or the maximum temperature in the drying phase, or the standard deviation between the set point and the measured variable during the drying phase, or… I don't know, you name it. You can go as crazy as you want and build as many features as you want starting from the data that you have. This is a way to remove the burden of the transient behavior of batches, and it's a way to actually compare batches by using simple statistics on different features of each batch.
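A minimal fingerprint sketch with pandas (the batch_id and phase column names are assumptions): every batch/phase pair is collapsed into the summary statistics an engineer would track, giving one wide row of features per batch.

```python
import pandas as pd

def batch_fingerprints(df: pd.DataFrame, sensors: list[str]) -> pd.DataFrame:
    """One row per batch; one column per (sensor, statistic, phase) triple."""
    stats = df.groupby(["batch_id", "phase"])[sensors].agg(
        ["max", "min", "mean", "median", "std"]
    )
    # Flatten e.g. ('torque', 'max') into 'torque_max'.
    stats.columns = ["_".join(col) for col in stats.columns]
    # Spread the phases out so each batch becomes a single wide row.
    wide = stats.unstack("phase")
    wide.columns = [f"{name}_phase{phase}" for name, phase in wide.columns]
    return wide
```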
The problem with this, as you can imagine, is that you can end up with a lot of different statistics to track and monitor, and sometimes it's very difficult to understand which ones are really relevant and which are not relevant at all. That's why we developed the plugin that I showed you just before, which uses autoML to do feature selection on all these fingerprints that you can create yourself.
The add-in can be installed by everybody in JMP. It basically looks like this: like any normal menu that you would have in JMP. It requires a Python installation, which is also automatically managed by the installer of this plugin.
If you want to use it, let's say… let's try. We want to model the final concentration of the solvent, which is our Y. You can just pop in all the sensor data that you have, and it will automatically do all the feature engineering: it will take the maximum, the minimum, the standard deviation, the median, the mean, and all the statistics you can possibly imagine of all the variables that we introduce. If you have information on the batch ID and the phase ID, then you can just plug it in.
Additionally, if you have the Python installation, you can ask the tool to make a SHAP plot of the SHAP values of the different features, to get a better understanding of what the boosted tree is doing behind the scenes to actually do the magic and select the features that are relevant or not.
If I just click on… Then you can tweak your number of trees and signal-to-noise ratio, and you can do whatever you want. You can even add weights; you can choose. If we click OK, the magic happens. Now, as you see, it's still computing, because behind the scenes the Python script is computing the SHAP values, so it might take some time before we get the results.
But let me see if I can… Yeah, I can move here already. This is the result, basically. As I said, the tool generates a lot of different statistical aggregations of the data: the standard deviation of the agitator speed, the standard deviation of the torque, the mean of the agitator speed. You can see it for yourself.
Then we also have… Oops, still computing. That's the beauty of doing it live. Sometimes it doesn't go as planned.
Here it is. This is the SHAP plot, and we're going to look at it later. Let's do it again, but without the SHAP value request, just because I want to show you another feature. I won't click on that; I won't be doing it anymore. Now… Oh, the SHAP [inaudible 00:16:57]. I'll do it again afterwards, but let's move on with this.
As you can see, we also have random and uniform noise with statistical features of it. This is introduced as a cutoff, as a way of selecting which features are really relevant and which are not relevant at all, or cannot be distinguished from noise. This is built in as standard; it gives you an automatic cutoff.
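The noise-cutoff trick is easy to reproduce outside the add-in; here is a hedged scikit-learn sketch (a plain gradient-boosted tree stands in for whatever the autoML picks): random and uniform noise columns are appended before fitting, and only the real features whose importance beats the best noise feature survive.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor

def select_features(X: pd.DataFrame, y, random_state: int = 0) -> pd.Series:
    """Return the importances of features that beat both injected noise columns."""
    rng = np.random.default_rng(random_state)
    X = X.copy()
    X["_noise_normal"] = rng.normal(size=len(X))
    X["_noise_uniform"] = rng.uniform(size=len(X))
    model = GradientBoostingRegressor(random_state=random_state).fit(X, y)
    importance = pd.Series(model.feature_importances_, index=X.columns)
    cutoff = importance[["_noise_normal", "_noise_uniform"]].max()
    return importance[importance > cutoff].drop(
        ["_noise_normal", "_noise_uniform"], errors="ignore"
    )
```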
One of the things you can see is that it selected the torque and the agitator speed as some of the interesting variables to look at. Now, you cannot really use this as… Actually, if you think about it, it's quite understandable, because depending on the wetness of the cake that you introduce, the torque consumed by the agitator will be higher or lower, depending on whether it has to work more or less. It's completely normal that at the beginning, the batches that are a bit wetter might offer less resistance, and the ones that are less wet will offer a little more resistance.
But this is the kind of feedback that you get from the tool. The standard output is a plot with the most relevant variables. You might have seen it passing by: it also makes a parallel coordinates plot of the output, colored by the target. Here you can see again that if the torque is a bit higher, the final concentration of the solvent is also a bit higher, and if the torque goes down, the wetness of the cake at the end is lower. The agitator speed is basically an effect of the torque; this, too, is the torque. But this is just a visual representation of what the tool does, as we were seeing before… but it's not there anymore. I don't see it.
You can also use SHAP values to look at the data in a different way. SHAP values, if you're not familiar with the term, are a way of visualizing the impact or effect of the different variables on the target. It's a way to actually explain the result of the machine-learning algorithm that you use behind the scenes.
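If you want to reproduce this outside JMP, the open-source shap package does the same job; a minimal sketch, assuming a fitted tree ensemble and its feature table from the previous snippet:

```python
import shap

def explain_model(model, X):
    """Beeswarm plot of SHAP values for a fitted tree ensemble.
    model/X are assumed: e.g. the boosted tree and features fitted above."""
    explainer = shap.TreeExplainer(model)
    shap_values = explainer.shap_values(X)  # one value per feature per sample
    shap.summary_plot(shap_values, X)       # left pushes the target down, right up
```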
Let's try to do it maybe one more time, selecting fewer parameters: let's say torque, agitator speed, and dryer temperature set point, which are the ones that have been [inaudible 00:20:48]. Okay, like that. Then we do it with phase and batch, and then we ask it to do the SHAP plot. Let's see. Coming. Bear with me a bit. It should be any minute, any second now. Oh, here we go.
The legend gives you the normalization of the value. If a point is on the left side, it has a negative effect on the target value, and if a point is on the right side, it has a positive effect on the target value.
Now, as you can see, the torque is again one of the most important ones. You basically see the same thing that we saw in the parallel coordinates plot and in the results of the analysis: if you have a lower value of the torque, you have a negative effect, or you are on the left side of the plot, and if you have a higher value of the torque, you have a positive effect. You can then analyze the other variables in the same way.
But this is a very powerful… We think it's a very powerful tool to visualize the effects and to break down and analyze what the algorithm spits back at you in a more efficient way. At least that's what we think, and that's why we included it in this tool.
Then you can just scroll down and look at all of the variables. Of course, we still have the random and uniform noise in there as well, even though it's not really relevant.
You might have noticed that the batch ID was relevant as well. That's a bit fishy in this plot, right? This is actually a good point to move into the next part of the talk: anomaly detection for batches, or a way of analyzing whether one of the variables is going out of spec.
The standard way to do this for batches, or for industry in general, is to look at some KPIs and see how they evolve over time. For example, we might want to look at… no, that's not what I wanted to open. We might want to look at the different phases and their durations, to look at the variation that we see. We expect a lot of variation in the deagglomeration phase and a little less in the heating phase and the cool-down.
The other way to look at this is basically to make a control chart of different parameters and see whether these parameters are inside the limits that you have specified. One way to start is to look at the target variable, for example; that would be one of the first variables you need to monitor.
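A control chart like that takes only a few lines; a minimal individuals-chart sketch with matplotlib, assuming a hypothetical final_solvent Series indexed by batch ID and classic three-sigma limits:

```python
import pandas as pd
import matplotlib.pyplot as plt

def individuals_chart(final_solvent: pd.Series) -> None:
    """Plot a per-batch value against its mean and 3-sigma control limits."""
    mean, sigma = final_solvent.mean(), final_solvent.std()
    ax = final_solvent.plot(marker="o")
    ax.axhline(mean, color="green")                     # center line
    for limit in (mean - 3 * sigma, mean + 3 * sigma):  # control limits
        ax.axhline(limit, color="red", linestyle="--")
    ax.set(xlabel="batch ID", ylabel="final solvent concentration")
    plt.show()
```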
Now, remember the graph that we showed before, where you could see that the batch ID had an impact on the solvent. Plotted like this, what we were looking at in that SHAP plot makes more sense: there has definitely been a trend. Going from batch 0 towards batch 70, there is variation in where the final solvent concentration has ended up. Up to batch 30 we were on target, then we went under target, and then we went too high on solvent as well. Something definitely changed during the process, and that's why we had that kind of visualization in the SHAP plot: it picks up that the batch ID is relevant for predicting the final solvent concentration, but that's just an artifact of this data.
Now, we don't know the cause. Most likely these are batches of different products, or the initial concentration differed between campaigns, or something else was going on. But this is an additional uncertainty inherent to batch processes when you have this variation in your raw materials. This is also true for other processes, but for batch processes it's much more relevant, as you can see here.
One way to look at the data, or to do anomaly detection, that has been widely published and is also widely used in this industry, is PCA analysis, a PCA and PLS combination, to understand the multivariate space at a specific point in time. If that is not representative of what is going on in the ongoing batch, then you will get an alarm. It's a multivariate way to look at the data.
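A minimal sketch of that multivariate monitoring idea with scikit-learn (not the published implementation): fit a PCA on fingerprints of known-good batches, track Hotelling's T² on the scores, and alarm when a batch leaves the normal operating region. X_good and X_new are assumed feature matrices.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def fit_monitor(X_good, n_components: int = 2):
    """Fit scaler + PCA on fingerprints of known-good batches."""
    scaler = StandardScaler().fit(X_good)
    pca = PCA(n_components=n_components).fit(scaler.transform(X_good))
    return scaler, pca

def hotelling_t2(scaler, pca, X):
    """Hotelling's T^2 of each batch in the PCA score space."""
    scores = pca.transform(scaler.transform(X))
    return np.sum(scores**2 / pca.explained_variance_, axis=1)

# Usage sketch: alarm above, e.g., the 99th percentile seen on good batches.
# scaler, pca = fit_monitor(X_good)
# limit = np.percentile(hotelling_t2(scaler, pca, X_good), 99)
# alarms = hotelling_t2(scaler, pca, X_new) > limit
```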
Now, with the Functional Data Explorer we can basically do the same, but instead of using standard PCA, we use the entire information of the trend. This is a standard tool that you can find inside JMP, under Specialized Models: the Functional Data Explorer. It's part of JMP Pro, JMP Pro only; that's what I'm using. If you have it, you can use it. We can do basically the same analysis. I already ran it, so we'll just relaunch it.
Basically, what you see is that if we look at the tank level as a function, it gives you summary statistics. The idea behind functional PCA, like PCA, is to identify eigenfunctions that can explain the shapes we see to a certain percentage. In this case, for the tank level it identified two eigenfunctions, and their combination can explain 97.3% of the totality of the shapes that we see.
Now, here you see all the shapes on the left, and you can clearly see that there are some that are not well represented; they're not similar to the rest. You can play around a bit and increase the number of eigenfunctions to include a third one, but JMP automatically selects the most appropriate number of eigenfunctions as a trade-off in explaining the shape. If you go back to two… there we go.
How does this work? Basically, as you can see, you have all the batches here, and then you have the score plot, which is actually what allows you to understand which batches are anomalous and which are not. Batch 61 is definitely a bit out there with respect to the rest. Then, as you can see here, you have batch 55. Going from left to right, you can see that there is an evolution of the batches along the Component 1 axis, which corresponds to this specific shape over there.
Depending on where you are on the Component 1 axis, the batches will have different shapes, and the maximum level will basically keep increasing until you reach batch 55 and batch 66, which are a bit anomalous with respect to the rest. This is the same concept as PCA, but with shape-function analysis instead of multivariate analysis done row by row and point by point.
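A rough way to mimic this outside JMP Pro (emphatically not the Functional Data Explorer's implementation, which fits basis functions first): put each aligned trajectory on a common grid and run ordinary PCA on the resulting matrix. The components then play the role of eigenfunctions and the scores give the anomaly plot.

```python
import numpy as np
from sklearn.decomposition import PCA

def fpca_scores(curves: np.ndarray, n_components: int = 2):
    """curves: (n_batches, n_grid) trajectories on the common, aligned grid.
    Returns the eigenfunctions, the per-batch scores, and the explained ratio."""
    pca = PCA(n_components=n_components).fit(curves)
    eigenfunctions = pca.components_          # shape (n_components, n_grid)
    scores = pca.transform(curves)            # one point per batch in the plot
    return eigenfunctions, scores, pca.explained_variance_ratio_

# Batches whose scores sit far from the cloud (like batch 55 or 61 in the
# talk) are the anomalous shapes.
```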
The idea, in the end, is that you could use this online to understand whether a batch is inside or outside the specification. You could do it per phase, for example. If you have a specific shape for one of the variables that you need to trend, you could use this to analyze and see where you are.
The same is true for the other shapes that we have, the other variables. You can do this for the dryer temperature variable. In this case, we have three different eigenfunctions, and they explain up to 87% of the variation. Again, by looking at the score plot you can spot anomalous batches just by eye: batch 34 has a flat top, while all the other batches have a pointy shape that you can find back in basically all the others.
The model that comes out can be used for online anomaly detection if you can implement it. By the way, if you have the new version of JMP, you can connect directly to your process historian if you have OSI PI. Otherwise, there's been another talk by my colleague, Carlos, about another plugin that we developed to extract data from your historian, which can connect to both OSI PI and Aspen IP.21. You can download your data directly, pop it in, and see whether a batch has been behaving according to your specification or not.
Basically, I think this more or less covers what I wanted to show: the two different methodologies that we have been using at Solvay to look at process data. I'm looking forward to seeing you at the summit in Spain next month, in March, if you are there. Otherwise, feel free to reach out to me or to any of my co-authors if you need more information.
Just as a wrap-up, this is the place where you can find the article that we published about this, with a little more information and a little more detail than what I've just shown you. Again, it's open access; you can download it for free from the link, and it's all there for you to look at and browse. Thanks again for your attention. That's it.