Hello all. I'm David Paige, I'm the Digital Champion of the Global Business Unit of Soda Ash & Derivatives at Solvay. Together with me, we have Carlos Perez, who is Industrial Data Scientist at Solvay at corporate level, who will be co-presenter of this presentation.
This presentation is about the scaling up of the use of machine learning techniques in the chemical process and concretely at Solvay. Here in this slide, we have the agenda of this presentation. First of all, I will share with you a brief introduction of our multinational company, Solvay. Then some words also about our global business unit, Soda Ash & Derivatives.
Then here in point number three, we will enter into the discussion about how machine learning techniques help improve our production process. Then we will go a little bit deeper into the usage of JMP in our GBU. I will explain to you the awareness sessions and the training that we provided to our population of engineers. Also, we will see a couple of practical use cases.
Then my colleague Carlos will share with you one add-in that they developed internally at corporate level at Solvay, which is very useful for us, for the final users, to connect to our main source of data, which is the MES, the manufacturing execution system. Finally, I will share with you the main challenges that we faced during this journey and also the lessons learned.
Brief introduction of our group, Solvay. We are a science company founded in 1863 whose technologies bring benefits to many aspects of daily life. Our innovative solutions contribute to a safer, cleaner, and more sustainable product found in homes, food, and consumer goods, planes, cars, batteries, smart devices, health care applications, and water and air purification systems.
Very important, our group seeks to create sustainable shared value for all. Notably, through its Solvay One Planet program, we have three pillars: protecting the climate, preserving natural resources, and fostering better life.
Here at the bottom of the slide, you can see some key figures of the group in 2021. As you can see, we have a little more than 21,000 employees all over the world. We have presence in 63 countries and we have 98 industrial manufacturing sites.
Now, if we jump to our global business unit of Soda Ash & Derivatives, which is the business unit that I work for, coordinating the implementation of what we call the digital transformation initiatives. As you can see, we have 11 production sites distributed around the world. We have six production sites here in Europe, but also we have two production sites in North America and one production site in Asia, apart from other locations around the world like warehousing and buildings.
Also, we have three R&I centers located in Brussels, in Dombasle, which is our manufacturing site in France, and in Torrelavega in the north of Spain. Globally, we are 3,200 employees all over the world.
Our products. These are the two main products that we produce in our global business unit. We produce soda ash and we produce sodium bicarbonate. The soda ash is mainly used for glass manufacturing, different types of glass for building, but also for photovoltaic panels or for containers, as you can see here, the example of the bottles.
Also, soda ash is used to produce detergents and very important and very new with the making of the [inaudible 00:04:44] of electrification that we are seeing around the world. Soda ash is also used for the production of lithium for the batteries.
Bicarbonate. Our sodium bicarbonate is used for different markets. First one, for industrial exhaust gas cleaning, which is our SOLVAir market. Also a very new application for the same purpose, gas cleaning, but for the ships. The sodium bicarbonate is also used for the pharmaceutical industry and for the food industry.
In this slide, I would like just to show you about the complexity of our production processes to manufacture our final products. As we said, our final products can be soda ash, light or dense, and also the refined bicarbonate.
To produce them, we need to consume different raw materials such as the limestone, the brine, and also we use the coke and anthracite in our lime kilns. As you can see, the production process is quite complex because we are using different assets like the absorbers, like distillers in the distillation sector, dissolvers, precipitation columns, filters, compressors.
We have a wide variety of assets used in the manufacturing process and very complex chemical reactions, mixing gases, liquids, solids. We need to take into account thousands of parameters in terms of temperature, pressure, flows, and so on. The use of advanced analytics and machine learning techniques is very important in order to improve this production process.
Here we are entering the chapter on how machine learning can help to improve our production process. First of all, to share with you our strategy. Clearly, in soda ash and bicarbonate, our strategy is to be competitive and keep our worldwide leadership position in this global commodity market, which is soda ash, but also in the premium market, such as bicarbonate.
What is our objective to reach this ambition in our strategy? Our objective is to reduce as much as possible the variable and fixed cost in our manufacturing sites while ensuring the overall equipment efficiencies, so the OEE, and the quality of our products.
Let me put some examples of how we can impact in our variable cost and fixed cost. In the variable cost side, clearly, one of the levers that we can improve is the yield. If we are able to increase our sodium precipitation yield in our carbonation sector, clearly what we are going to do is to reduce the need of raw material and energy in our production process to produce the same quantity of soda ash.
The same for the topic related to energy efficiency. In the previous slide, I showed you the complexity of the production process and the energy that we need to use in the different sectors such as the distillation sector, calcination sector, or lime kilns. If we are able to improve the main parameters in these sectors, we will be able to reduce the specific consumption of energy in our production process.
In terms of fixed costs, one of our main fixed costs in our production process is the maintenance cost. We have unplanned events, unplanned mechanical breakdowns in our industrial assets, and also we perform regular maintenance activity and cleaning of our assets. If with these machine learning techniques we are able to anticipate, to predict these unplanned events, we could also potentially reduce our fixed costs.
Our idea, our ambition is to combine the deep expertise that our process and control engineers have on the domain, the soda ash production process, together with the IT and computer science skills and math and statistics skills. This is our ambition.
Traditional method. Traditionally, what our engineers are doing is to use the inputs of our process with a theoretical model, for example, the thermodynamics or the chemistry, in order to understand the process and to get an output. This is the traditional method. But now with the machine learning techniques, what we can benefit from is the historical data.
In our sites, as I explained before, in this very complex environment of the production of soda ash, we store data from thousands of different sensors in our MES, the manufacturing execution systems. Data from temperature, flows, pressure in different parts of the process. We have this historical data, so we can provide the machine learning algorithms with inputs and outputs of our process, creating machine learning, big data models that could help us to improve our process and to understand our process better for future inputs.
Now, here in this slide, just to share some publications, also from Dow, another important multinational chemical company, that is sharing with us here that a chemical company must invest to create a critical mass of chemical engineers with technical skills in statistics, mathematics, modeling, optimization, process control, visualization, simulation, and programming.
But it's much easier to train chemical engineers on data analytics topics, rather than to train data scientists on chemical engineering topics. We completely agree on this statement, and this is what we want to do at Solvay.
We have a lot of very skilled chemical engineers, and we want to train them in these advanced analytics techniques. This is the main reason why we launched the program of Machine Learning Techniques with JMP in our GBU, our Global Business Unit.
It was a program that started in 2021. The target population of this program was 47 engineers in our GBU. It was led by the industrial data science team at corporate level.
What was the content of this program? We had a one-hour session, first of all, one day, where we explained why we want to use the machine learning techniques with JMP in order to improve our production processes, as I explained to you before.
Then during seven days, we made an individual online course for each of the 47 engineers related to a statistical thinking part. It was just an introduction to the statistical thinking methodology. Then we entered the JMP introduction part, explaining the tool, explaining the main features, the benefits of using JMP, and the main tips to start creating some graphics, some statistical reports, and some basic things. We combined theoretical lessons with practical exercises and plenary sessions on the web.
Then during 15 days, we went into more detail about what we can do with machine learning techniques, for which we did the same individual online courses and also practical exercises and plenary sessions.
All of this training program lasted, let's say, around one month. But then the most important part was the selection of the real cases to solve in the different manufacturing sites for the different participants of the training. We made this selection and we provided a license of JMP, of course, and regular support with weekly plenary meetings and individual coaching.
Let me give two examples of practical use cases from this selection that we did after this training session. The first one is about increasing the sodium precipitation yield in the production processes of Rheinberg, our manufacturing site in Germany, and Torrelavega, our site in Spain.
How did we use JMP on this project? First of all, to screen the multiple variables that we think can impact our main target variable, which is the sodium precipitation yield, in order to explain the variability of this target.
The goal was to investigate which variables can better explain the variability of our target. For this, we used one of the tools that we learned during this one-month course, which is Predictor Screening.
This is very important, because, as I explained to you before, we have hundreds of variables impacting this output, which is the yield in the process, so it's very difficult to analyze them one by one. This tool allows us, in a very quick and intuitive way, to understand what are the main contributors explaining the variability that we have in our target.
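To make the idea of screening concrete, here is a minimal sketch in Python. It is not JMP's Predictor Screening, which is built on a bootstrap forest; as a much simpler stand-in it ranks each candidate variable by its squared Pearson correlation with the target, which only captures linear, one-variable-at-a-time effects. The sensor names, the synthetic data, and the helper names `pearson` and `screen_predictors` are all assumptions made for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column cannot explain any variability
    return sxy / math.sqrt(sxx * syy)


def screen_predictors(columns, target):
    """Rank candidate columns by r^2 against the target, highest first."""
    scores = {name: pearson(values, target) ** 2 for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic, made-up data: the yield depends strongly on temperature,
# weakly on flow, and not at all on pressure.
rng = random.Random(0)
n = 500
temperature = [rng.gauss(60, 5) for _ in range(n)]
flow = [rng.gauss(120, 10) for _ in range(n)]
pressure = [rng.gauss(2, 0.2) for _ in range(n)]
yield_pct = [
    75 + 0.8 * (t - 60) + 0.05 * (f - 120) + rng.gauss(0, 1)
    for t, f in zip(temperature, flow)
]

ranking = screen_predictors(
    {"temperature": temperature, "flow": flow, "pressure": pressure},
    yield_pct,
)
for name, score in ranking:
    print(f"{name:12s} r^2 = {score:.3f}")
```

A forest-based screening, as in JMP, can also pick up nonlinear effects and interactions that this correlation ranking would miss, which matters with hundreds of process variables.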
Also, we need to say that JMP is a very intuitive and code-free advanced analytics tool, and this is very important because not all of the production engineers have the knowledge to use programming tools. Also, it is important to say that visualizing the long-term variability of the target, and also its relationship with the most important variables, is a very important feature that JMP has.
Finally, also, we use JMP to elaborate the statistical reports about the performance of the different approaches, the different trials that we performed along the project. This is the first example where we used JMP in order to understand our process better and improve our yield in both of our manufacturing sites.
Second one, in this case, we are talking about finding the root causes for the variability of one important parameter of the final product, which is the carbonate content in the sodium bicarbonate.
Here we used JMP similarly to the project that I explained before: we screened the multiple variables and selected the most important ones to explain the variability of this target. For this, of course, we used again the Predictor Screening, as you can see here on the right-hand side of the slide.
Also here, it was very, very important, the visualization in a graphical way, the interaction of the main variables that we identified thanks to the predictor screening. You need to understand that on this type of projects, we need to collaborate with different stakeholders. The production engineers cannot solve this type of very complex projects alone.
Here we need to align and speak and generate debates with the production operators in the field, with production operators in the control room, with site managers, other engineers in other plants, experts at corporate level, and so on. It's very important to translate what we analyzed in an analytical way into a graphical way to generate these debates.
Finally, it was also very important to help the decision-making process, in order to, at the end, of course, take decisions on these main variables, which we demonstrated in an objective way to the people who at the end decide to make a modification in the process.
This is what we have done also in this project. At the end, it's about making a modification, making a small investment to modify a part of our installation in order to reduce the variability of this carbonate content of the final product.
That's all from my side for the moment. Now, I will give the floor to my colleague, Carlos, data scientist at the corporate level, who will explain to you an add-in that we developed internally at Solvay that allows us to connect the data from our MES, which is the place, as I explained before, where we store all the data, into JMP.
Thank you, David. I will go ahead and share my screen. Can you all see?
Yes.
Okay, I want to get started. Do you see this ribbon or it's only me?
Yes, it's visible.
Yes, thank you, David. In this section of the presentation, I will show you one tool, one add-in to demonstrate, an open source add-in that we have created in the team of industrial data scientists at corporate level in Solvay. This is a team that supports all of the global business units. That means that we have to provide solutions for all of the different MES that exist in Solvay.
We did the automation of this task because we saw the situation that was happening before where we had to download the data in a spreadsheet sometimes without a lot of advanced capabilities. Then we had to import this data, treat it, and then finally be able to use it without...
Sometimes it was not even clearly identified because it was only the name of the sensor, but sometimes the name of the sensor is not very clear, because it's not very well standardized, the notation.
To leverage the power of data, we said, "Okay, let's make the process of extracting the data as automated as possible so that all the process engineers can use it." We have leveraged this power in JMP, in GBU Soda Ash, and also in other GBUs with an add-in that is able to connect to the two most common databases in Solvay, the MES historians IP21 and PI, from AspenTech and from AVEVA, respectively.
This add-in connects directly to the databases if we are in the local network and is able to fetch with a query whatever information is stored there. We have automated the task of connecting to the server, selecting the query parameters, and downloading the sensor data table in a regular format with the description and units. Also, we have integrated other functions.
It's also worth mentioning that in this case, we are dealing with a lot of sites. The sites of soda ash are among these, of course, where we use, as I mentioned, two main databases for historian MES, and where this is more or less the range of sensors that we have to take into account.
This is how it looks today the add-in which is available from the menu add-ins. It was developed from an application menu, and it has also some script on the background. I will show a clear demo with this, so bear with me.
It's a focus on process engineers. As I mentioned, it integrates a description and engineering unit, which is very useful for identifying what data you are using in your analysis.
At the end of the day you get two data tables. One of them is a summary, with typical statistics, and the other one is a time series with all the details of every sensor according to your extraction. On top of that, there are the functions that I mentioned.
I will go out of this script to be able to show you the demo of this tool. Just bear with me. This is JMP, as you know it. Here we have an add-in that when it is connected to the database, you are able to see all the list of servers here in the list.
It is connected to a server list that is maintained by another team, which is an IT team. In this server list, it is also possible to modify the details in case one of the servers is not available. For example, one can put the IP address and domain here to add a new server that is connected to the internal network of Solvay.
After that, after selecting your server, the next step is to go ahead and filter your sensor by name or description. This is important because as we mentioned, we have in the order of thousands of sensors, which means that if you are going to go ahead and try to see everything that is available, it might take a long time or the server might crash.
For that reason, we have this filter so that you can see what's relevant to you, in case you want to see flow sensors, temperature sensors, pressure sensors, or you want to look up the sensors by name, by description, or both.
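As a rough illustration of this filtering step, here is a small Python sketch that narrows a tag catalog by name, by description, or by both. It is not the add-in's code; the `Tag` record, the `filter_tags` helper, and the example tag names are hypothetical, and a real historian would apply the filter on the server side rather than in memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tag:
    """One sensor entry in a (hypothetical) historian catalog."""
    name: str
    description: str
    unit: str


def filter_tags(tags, name_contains="", description_contains=""):
    """Keep tags whose name and description contain the given text.

    Matching is case-insensitive. An empty filter matches everything,
    so the caller can filter by name only, description only, or both.
    """
    name_key = name_contains.lower()
    desc_key = description_contains.lower()
    return [
        tag
        for tag in tags
        if name_key in tag.name.lower() and desc_key in tag.description.lower()
    ]


# Made-up catalog entries for illustration only.
catalog = [
    Tag("TI-101", "Distillation column top temperature", "degC"),
    Tag("FI-205", "Brine feed flow", "m3/h"),
    Tag("PI-310", "Compressor outlet pressure", "bar"),
    Tag("TI-422", "Calciner outlet temperature", "degC"),
]

temperature_tags = filter_tags(catalog, description_contains="temperature")
print([tag.name for tag in temperature_tags])
```

Filtering before listing keeps the request small, which is the point made above about not asking the server for thousands of sensors at once.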
After you are done with this filter, what you need to do is select the relevant tags from this other list which are given in the presentation, just an example. But in this case, I'm not connected to a local network, but you can see an example here.
You will see here all the list of available sensors, and what you need to do is to add them to the right-hand side, which is in this way. The right-hand side list means that these sensors are ready to be extracted for you. Now what you need to choose is the start time and end time for your extraction.
You select that; perhaps it is one day, one month, one year, I don't know. Then you have to choose what type of method you want. The most common is interpolated because it means that you will have evenly spaced data by minute, by second, by hour or by day. Also, we offer an aggregation, which in this case is the average. Also, we offer to extract the actual data as it was recorded by the sensor.
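The difference between the interpolated and the averaged extraction can be sketched in a few lines of Python. This is only an illustration under simple assumptions: samples are `(time_in_seconds, value)` pairs sorted by strictly increasing time, interpolation is linear, and the function names `interpolate` and `window_average` are made up. The historians themselves may interpolate or aggregate differently.

```python
from bisect import bisect_left


def interpolate(samples, start, end, step):
    """Linearly interpolate raw samples onto an evenly spaced time grid.

    Grid points outside the recorded range take the nearest recorded value.
    """
    times = [t for t, _ in samples]
    grid = []
    t = start
    while t <= end:
        i = bisect_left(times, t)
        if i == 0:
            value = samples[0][1]
        elif i == len(samples):
            value = samples[-1][1]
        else:
            (t0, v0), (t1, v1) = samples[i - 1], samples[i]
            value = v0 + (v1 - v0) * (t - t0) / (t1 - t0)
        grid.append((t, value))
        t += step
    return grid


def window_average(samples, start, end, step):
    """Average the raw samples falling in each [t, t + step) window.

    A window without any sample yields None.
    """
    rows = []
    t = start
    while t < end:
        values = [v for ts, v in samples if t <= ts < t + step]
        rows.append((t, sum(values) / len(values) if values else None))
        t += step
    return rows


# "Actual" data: unevenly spaced readings, as a sensor might record them.
raw = [(0, 10.0), (40, 14.0), (100, 20.0), (130, 17.0)]

print(interpolate(raw, start=0, end=120, step=60))
# [(0, 10.0), (60, 16.0), (120, 18.0)]
print(window_average(raw, start=0, end=120, step=60))
# [(0, 12.0), (60, 20.0)]
```

The interpolated grid is convenient when several sensors must line up row by row on the same timestamps, while the raw readings keep exactly what was recorded.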
One more thing, if for some reason you already know the list of sensors that you want to download, and you don't want to browse by name or description, you can directly paste also this list from a CSV format that you have available.
When you have all these parameters ready, what you have to do is hit run extraction. This will take the time it takes the SQL query to go to IP21 or PI. When it finishes, you will get two tables with what I mentioned before. One with the summary, which will allow you to understand the typical statistical values for each sensor, row by row.
In this case, you have the name of the sensor, description, units, and also the mean, standard deviation, max, min, and range of the sensor. In this way, you can understand if the sensor is perhaps not working or something odd is going on, so that you don't need to extract it.
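A summary table of this kind is easy to picture with a short Python sketch. It computes one row of statistics per sensor and flags sensors whose readings never change, which is one simple sign that a sensor may not be working. The functions `summarize` and `flat_sensors`, the tag names, and the values are assumptions for illustration, not the add-in's implementation.

```python
import statistics


def summarize(series):
    """Build one summary row (mean, std, min, max, range) per sensor."""
    rows = []
    for name, values in series.items():
        low, high = min(values), max(values)
        rows.append(
            {
                "sensor": name,
                "mean": statistics.fmean(values),
                # The sample standard deviation needs at least two points.
                "std": statistics.stdev(values) if len(values) > 1 else 0.0,
                "min": low,
                "max": high,
                "range": high - low,
            }
        )
    return rows


def flat_sensors(rows, tolerance=0.0):
    """Names of sensors whose range does not exceed the tolerance."""
    return [row["sensor"] for row in rows if row["range"] <= tolerance]


# Made-up readings: the second sensor is stuck at zero.
data = {
    "TI-101": [78.2, 79.1, 77.8, 80.3],
    "FI-205": [0.0, 0.0, 0.0, 0.0],
}

summary = summarize(data)
for row in summary:
    print(row)
print("Possibly not working:", flat_sensors(summary))
```

Checking the range first is a cheap way to drop dead signals before spending time on the full time series.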
Furthermore, you will also get the time series data, which in this case looks like this. You get a column with the time stamp, then also one column by sensor with the proper format. For example, this is a continuous amount, this is a discrete amount, and everything is properly formatted. On top of that, you get the description of the sensor as I mentioned, and the units, which is very useful for processing units. This allows you already to apply all the methods from JMP.
Here you have an automated version of the add-in that allows you to extract data directly to JMP. It's also open source, so if you are interested in contributing, you can go to the community in JMP or in GitHub and contribute your own developments.
On top of that, we also offer three functionalities, which are the update table. The update table will make sure that when you are done extracting your data and you perform one analysis, you can keep updating the same analysis the next day.
For example, let's say, yesterday I downloaded this data and I created one column for calculating some value. Let's say that today I want to also see how this calculated value is. That means I just have to hit this button and this data will be updated with the newest data from yesterday to today.
Also, we offer a refresh functionality, which is meant to work as a kind of dashboard. This means that it will fix a time window and you will be able to see your analysis with respect to the current time. That means that if I performed an analysis yesterday and I have a new column with a new formula, I can hit this button and only see the relevant data table for the current period, forgetting the past.
That means that, as I said, the time window is fixed, and then you can see the same analysis with the current time instead of with the old time. One of them will only see one time window and the other one will update the full time window.
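One way to read the difference between update and refresh is in terms of the time window each one requests. The Python sketch below encodes that reading: update keeps the original start and extends the end to the current time, while refresh keeps the window length and slides it to end at the current time. This is an interpretation of the description above, not the add-in's code, and the function names are hypothetical.

```python
from datetime import datetime


def update_window(start, end, now):
    """Update: keep the original start and extend the end to the current time."""
    return start, max(end, now)


def refresh_window(start, end, now):
    """Refresh: keep the window length and slide it to end at the current time."""
    return now - (end - start), now


# Yesterday's extraction covered one day; "now" is one day later.
start = datetime(2023, 3, 1, 8, 0)
end = datetime(2023, 3, 2, 8, 0)
now = datetime(2023, 3, 3, 8, 0)

print("update: ", update_window(start, end, now))   # two days of data
print("refresh:", refresh_window(start, end, now))  # the latest one day only
```

In both cases any calculated columns stay in the table; only the rows behind them change.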
Furthermore, you also have the add new tags, which means that if for some reason you forgot to add a tag and you remember that it's very important, you can add it with this functionality.
With all this said, I will go to the next slide. That means that by this point you already have a nice data table in JMP, with all these functionalities that we mentioned: update table, refresh table, and add new tags. This allows you already to use the typical methods for advanced analytics in JMP. For example, here I am showing both the JMP and JMP Pro versions, but this is up to you.
We also empower the user to use another add-in that we also developed that is called Predictor Explainer, which will be presented in another Discovery talk. We also have other types of analysis. This will allow us to perform the typical task in data analytics, which could be root cause analysis, anomaly detection, process optimization, and others.
With this, I will let David conclude the presentation.
Yes, thank you very much, Carlos. I don't know if you are seeing now my screen. If you stop sharing, maybe.
Yes, stop share.
Okay. Good. Let me reshare the screen.
Yeah.
Okay. Could you see now my screen?
Yeah.
Perfect. Thanks, Carlos. Thanks for the support that you provided to our GBU, not only developing this add-in, but also coaching our production process engineers on JMP, too.
Last slide to share with you what are the main challenges that we faced during this journey of scaling up the usage of JMP in our GBU, and also the lessons learned and the next steps.
Today, let's say that around 20% of the target population with which we started this program two years ago continues using JMP on a routine basis. The main blocking points that we found are, of course, resistance to change. Some people are more comfortable using other tools like Minitab or only Excel files. In any project or initiative that requires a change in a tool, there is always this resistance to change, which requires time and effort to overcome.
But another reason is also the lack of time. Lack of time that is linked also to priorities. The priorities of the role of production and process engineers are not always fully oriented on process optimization only, because sometimes there is too much reporting to do and other topics to cover in their role.
The main points to keep during this process are this type of awareness sessions that we did with practical industry success examples. This is very, very important. In order to convince the people and to show the value of using machine learning techniques to improve our process and to reach this competitive level that we want as a company, as a business, we need to use these practical industry success examples.
Because this is related to a population of chemical engineers, who will not understand if we start to talk about different examples in marketing or in finance to improve all of these other areas. We need to show them clear and concrete examples related to the process industry.
Then also an important point about the importance of this Predictor Screening tool as a key tool for us for the variability sourcing. The main problem that we have, as I explained before, is the variability that we have in certain parameters of our process that we need to reduce.
If we are able to reduce this variability of the key parameters, we are going to really reduce our variable and fixed cost in our production manufacturing sites. This tool is very important for our production engineers to find the root causes of this variability and act on them.
Also an important thing is this combination that we did between plenary sessions, so all together sharing thoughts and experiences, and individual practice: providing time to the people to practice on their own and then exchange in a common call.
Finally, the points that we identified to reinforce and to implement in the near future are, first of all, of course, in order to tackle this problem of resistance to change, we need to convince the site management about the importance of analytics for the production and process engineers. We need to launch a series of awareness sessions dedicated to them. This is an item we are going to work on a lot along this year.
Also very important for us, we identified this strong individual coaching for the production and process engineers when they start to use JMP in the real cases, in the real projects. Because JMP requires time; the different tools, such as Predictor Screening and other tools, require time. It's very important for the very first projects that an engineer develops using JMP to have a good coach, a good trainer to accompany them during the process.
That's all from our side. Thanks a lot for your attention. If you have any questions for me or for Carlos, we are available. Thanks a lot.