A Method To Strategically Pre-process Data From Industrial Processes Before Storage and Analysis (2023-EU-30MP-1238)

Günes Pekmezci, Senior Expert for Industrial Internet of Things, Bundesdruckerei
Luís Fernando Ferreira Furtado, Senior Expert in Industrial Controllers and Robotics, Bundesdruckerei
Michael List, Manufacturing Engineer, Bundesdruckerei

 

Despite the development of new network and media technologies, intense use of bandwidth and data storage can be a limiting factor in industrial applications. When recording sensor signals from multiple machines, a question must always be asked: which meaningful information can be extracted from the data, and what should be saved for later analysis? The answer to this question is a method proposed and implemented by the Production Data Engineering Team at Bundesdruckerei GmbH in Berlin, a wholly owned subsidiary of the German federal government that produces security documents and digital solutions. The method pre-processes data directly in the machine controller, strategically reducing the amount of data so that only the meaningful information is sent over OPC UA to the network, stored in the database, and further analyzed in JMP. A case study is presented, describing the implementation of this method on torque and position data from a servomotor used in a cutting process. The JMP Scripting Language is used to automatically generate reports on cutting tool wear, which is also analyzed in combination with the quality data of the product. These reports allow the production engineers to understand the machines better and to plan tool changes strategically.

Hi, I'm Günes Pekmezci, and this is my colleague, Luis Furtado. We both work at Bundesdruckerei as engineers on the data team in the production department. Today, we would like to present a method to strategically process data from industrial processes before analysis and storage.

First of all, I would like to tell you a little bit more about our company. Bundesdruckerei is a government-owned company that produces security documents and digital solutions. We are getting bigger every day; right now we have 3,500 employees, and we continue to grow. These figures are from 2021.

In that year, we had sales of €774 million. We hold over 4,200 patents. Most of our profit comes from German ID systems, which I will talk about a little more in the next slides. Secure digitization solutions are another big source of profit for us.

If we look at our target markets and customers, the first one, as I said, is official ID documents. We physically and digitally produce official identity documents such as ID cards, passports, and residence permits, and this is our biggest market.

Then we also have security documents: we produce banknotes, postage stamps, tax stamps, and related security features for the government. On top of that, we have a growing eGovernment department, where we create solutions for the authorities, mostly German state authorities, to digitalize their public administration systems.

We also offer high-security solutions. In this department, we create solutions with higher security requirements for security authorities and organizations. We also have a target market in the health industry, where we create products and systems for secure and trusted digital health systems.

Beyond that, we are active in the finance field, where we create products and systems to control and secure financial transactions in both the public and the enterprise sector, which includes taxes, banks, insurance, and so on.

Coming to our use case, what we want to share with you today is a use case that we decided to implement for predictive maintenance. Like every other company, our aim was to create use cases for the new digital era. We thought about what we could analyze: big data, predictive maintenance, and things like that.

We decided to start with our most important document, the German passport. This document is very complex and has a lifetime of 10 years. We also have a high production rate here, so we decided to create a predictive maintenance use case for one process on this document.

Our process is the punching process. It was a good process for us because we understand it well, and, which is very important in the Industrial Internet of Things, we had access to the data we could analyze to create our information.

Our objective for this use case was to achieve better product quality through predictive maintenance based on the tool wear state. Instead of reacting once the tool is worn out, we decided to look at the data and create information that allows us to plan the tool change in advance.

We can also minimize our downtime and our scrap rates, apply this use case to other machines, and use it to follow the long-term behavior of the process. It was a really good use case for us to start with.

I will hand over to Luis to explain how we approached this use case, what exactly we did, what our challenges were, and how we found solutions for them.

Thank you. I'm going to present a bit more about our product and process. The product we are analyzing in this study is the passport.

The passport, when you think about it, is a book. It's like a sandwich full of pages, and those pages have a lot of security features: the printed picture, the lasered data, the chip and its antenna, holography layers. There are several security features inside the German passport.

Because it is a sandwich, there are also a lot of machines that bring all those features into the product. Once the sandwich is assembled, you need to cut it to the right size according to the norm. When we cut it, we separate the finished book from the borders that we don't need anymore.

The point is that for this cutting process we use a punching machine. The tool installed at the end of this punching machine wears over time, and the quality of the cut at the end is no longer as good as it was in the beginning. What we are trying to do in this project is determine the right time to change the tool.

Here is a picture of the end product, the passport, and the borders that were cut off. I'm going to show a sketch of the machine, how it works, and what the original idea was.

But first, here is our original data architecture. We have a machine with several sensors: sensor number 1, 2, up to however many sensors we need to measure. We bring all the sensor signals to the machine PLC, the controller of the machine, then mirror this data to the master computer, and mirror it again to the database.

That was our first implementation. The database ends up holding a lot of data, and then we start analyzing the machine and trying to understand what is happening in it, in this case what is happening with the punching tool that is cutting the passport.

Looking at the sketch of this machine, we have a servomotor that turns a wheel. Through a mechanical linkage, it moves the punching tool up and down. At the end we have the cutting tool, which has exactly the final shape we need.

Over time, this tool wears. It's not that sharp anymore, the quality of the product we are producing is no longer good, and then you need to change the tool to make it sharp again.

Good. How can you be sure this tool is still good enough to cut? We measure the position of the servomotor and the torque of the servomotor, and we bring the position and torque data to the controller. Then, as I showed on the previous slide, we mirror the data to the master computer and then to the database.

In an industrial controller, the signal is not continuous. The curve is not continuous like here; it is discrete. You have to think of one measurement per CPU cycle, per clock tick of the CPU. In this case, we collect all this data, transfer it to the master computer, and then do the analysis from the database.

But the point is, we realized that over OPC UA, not 100% of the data arrives. In the good scenario everything is fine and all the points end up in the server and the database, but sometimes we have missing areas, gaps where data simply does not come through. We realized that we receive only about 95% of the data; roughly 5% is lost at the cycle rate we are sampling at.

This loss could fall on a point we don't care about, but it could also be exactly the point of the peak. When we miss data there, we compromise our measurement of the tool.
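To make the consequence of this loss concrete, here is a minimal, purely illustrative Python sketch; all numbers are invented, not taken from the machine. It builds a discretely sampled torque curve with a narrow peak, drops a few samples right at the peak (the worst case for a small transmission loss), and compares the recorded maximum with the true one:

```python
import numpy as np

# Hypothetical torque curve for one cutting stroke, one sample per CPU cycle.
t = np.arange(200)
torque = 5.0 + 40.0 * np.exp(-((t - 100) / 3.0) ** 2)  # narrow peak around sample 100

# Worst case for a small transmission loss: the missing samples fall on the peak.
lost = np.zeros(t.size, dtype=bool)
lost[98:103] = True            # 5 of 200 samples (2.5%) never reach the database
recorded = torque[~lost]

print(f"true peak:     {torque.max():.1f}")   # ~45.0
print(f"recorded peak: {recorded.max():.1f}")  # noticeably lower
# When the dropped samples fall on the narrow peak, the recorded maximum
# underestimates the real cutting torque and distorts the wear trend.
```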

Even setting that aside: with a 5% loss you still have 95% of the data going into storage, and 95% of the data for all the sensors on a machine, for all the machines in the production process, is a lot of data. After a year you realize you have a huge amount of database storage, and that is something you want to reduce.

So with this original implementation we still had missing data, often exactly at the points we needed to measure, and we had open questions. The first question: is it possible to measure this tool wear in a reliable way using the motor torque? The second: how can we reduce the amount of data we send to the database?

Good. Then the first idea we had: we decided, "Okay, we won't take the data from the database. We're going to collect the data directly on the machine with a different method, so we don't lose anything." 100% of the data reaches the computer because we measure directly in the machine controller.

Let's run this experiment many times for different setups of the machine. Let's see whether the curve always has the same shape and whether its amplitude changes when we change the scenario on the machine. In the end we had four scenarios, we ran this test extensively on the machine, and this is the result of the experiment.

We tried an old, worn tool, a tool that was not that sharp anymore, with a 32-page passport. We have two products: a passport with 32 pages and a passport with 48 pages.

The customer can choose: if you are going to travel a lot, you order 48 pages. So we tried the old, worn tool with 32 pages and with 48 pages. Then we changed the tool for a new one and repeated the experiment with the new, sharp tool for the 32-page and the 48-page product.

This is the result. All the curves have the same shape; this is a superposition of many repetitions, and the variation is very small. But we can also see very clearly that the peak value for the old tool with 48 pages is noticeably higher than for the old tool with 32 pages, and that with the new tool the peak value is lower, because less force is needed to cut.

This is the torque at the motor: when less force is needed to cut, because the tool is sharp, the torque is correspondingly lower.

Good. With this we had some information: all the scenarios show the same curve shape. So we realized, "Okay, then I don't need to record the whole curve. I can record only the peak." That is what is interesting for us in the new implementation we are proposing here.

The peak value can be used for two different things. It can be used for tool wear monitoring, which was the original idea. The other thing that is important for us is product classification: from the peak you can also check whether you are producing a 32- or a 48-page product; it is a reliable way to say which one it is.
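As a rough illustration of how a single peak value could serve both purposes, here is a hypothetical Python sketch. The threshold values are invented for illustration only; in practice they would have to be derived from experiments like the ones described above.

```python
# Hypothetical peak-torque thresholds in Nm (illustrative, not the real machine values).
PAGE_THRESHOLD = 30.0                   # above this, the stroke looks like a 48-page book
WEAR_WARNING = {32: 28.0, 48: 38.0}     # per-product level at which the tool looks worn

def classify_stroke(peak_torque: float) -> dict:
    """Classify one cutting stroke from its peak torque."""
    pages = 48 if peak_torque > PAGE_THRESHOLD else 32
    worn = peak_torque > WEAR_WARNING[pages]
    return {"pages": pages, "tool_wear_warning": worn}

print(classify_stroke(25.0))   # {'pages': 32, 'tool_wear_warning': False}
print(classify_stroke(41.0))   # {'pages': 48, 'tool_wear_warning': True}
```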

Good. So what is the difference? The difference is the implementation directly in the controller of the machine. The whole sketch of the machine is the same, and the data arrives in the controller in the same way.

What we did differently is that we preprocess the data. We filter it and define a window, and in this window we search for the peak. When we find it, we take the peak of the torque and the motor position at which this peak occurred. Then we transfer only that one set of values, not the whole curve.
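The preprocessing itself runs in the PLC; the Python sketch below only mimics the logic, and the window limits and names are assumptions made for illustration. Inside the position window of the cutting stroke it keeps the running maximum of the torque and the position at which it occurred, and only that pair is forwarded.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

# Hypothetical window of motor positions (degrees) in which the cut happens.
WINDOW_START = 170.0
WINDOW_END = 190.0

@dataclass
class PeakResult:
    peak_torque: float
    position_at_peak: float

def extract_peak(samples: Iterable[Tuple[float, float]]) -> Optional[PeakResult]:
    """samples: (motor_position, torque) pairs, one per controller cycle.

    Returns the peak torque inside the cutting window and the position where it
    occurred, i.e. the only values that are sent on to the master computer."""
    result: Optional[PeakResult] = None
    for position, torque in samples:
        if WINDOW_START <= position <= WINDOW_END:
            if result is None or torque > result.peak_torque:
                result = PeakResult(peak_torque=torque, position_at_peak=position)
    return result

# One simulated stroke: torque rises towards the bottom of the cut.
stroke = [(float(pos), 10.0 + 0.5 * pos) for pos in range(160, 200)]
print(extract_peak(stroke))   # PeakResult(peak_torque=105.0, position_at_peak=190.0)
```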

How does that work out in the end? With the original implementation we stored, per sensor on a machine, 11.7 gigabytes every year. That is quite a lot. When you consider that we have several hundred, almost a thousand sensors on a machine, and more machines in our production area, this is very critical for us.

With the proposed implementation everything looks very similar. The sensors go to the machine, but inside the machine we do the preprocessing. We keep just the meaningful information we need, transfer less data to the master computer and less data to the database, and do our analysis on this much smaller but meaningful data set.

In this case the volume is reduced by well over a thousand times: now it is 8 megabytes per year per sensor. This is a good implementation.
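A quick back-of-the-envelope check of that factor, using only the two figures from the slide:

```python
raw_gb_per_sensor_per_year = 11.7      # original implementation
reduced_mb_per_sensor_per_year = 8.0   # peak-only implementation

factor = (raw_gb_per_sensor_per_year * 1000) / reduced_mb_per_sensor_per_year
print(f"reduction factor: ~{factor:.0f}x")   # roughly 1,460x, well over a thousand times less
```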

This was implemented in JMP and JMP Live. I'm going to hand back to Günes, so she can explain the next steps and what we did afterwards.

Thank you, Luis. How did we turn this analysis into information in JMP? Like everyone else, we started by analyzing our data in JMP, and even our huge data sets, around 20 million rows, were easy to analyze there. But once we decided to keep just the peak values, our reports became much lighter and much more informative. And since that worked so well, we decided to publish our results to JMP Live.

Right now in JMP Live we have the following reports, generated automatically every week. There is a weekly meeting for the machine colleagues, and they look at this report to decide when it is time to change the tool.

Here you can see the different machines; we have six machines of this kind. You can see the peak value of the torque and how it develops over the weeks.

You can also see that when there is a tool change on machines 1 and 2, the peak values the following week start again from a lower level, which Luis already explained.
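The actual reports are generated with JSL and published to JMP Live. Purely as an illustration of the weekly aggregation behind them, here is a hypothetical Python sketch; the column names and values are assumptions, not the real data:

```python
import pandas as pd

# Hypothetical table of peak values as stored by the preprocessing step:
# one row per cutting stroke with machine id, timestamp, and peak torque.
df = pd.DataFrame({
    "machine": ["M1", "M1", "M1", "M2", "M2", "M2"],
    "timestamp": pd.to_datetime([
        "2023-01-02", "2023-01-09", "2023-01-16",
        "2023-01-02", "2023-01-09", "2023-01-16",
    ]),
    "peak_torque": [31.5, 32.1, 25.2, 30.8, 31.0, 31.4],  # M1 drops after a tool change
})

# Weekly mean peak torque per machine: the trend reviewed in the weekly
# meeting to plan the next tool change.
weekly = (
    df.groupby(["machine", pd.Grouper(key="timestamp", freq="W")])["peak_torque"]
      .mean()
      .reset_index()
)
print(weekly)
```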

This is the JMP Live report with which we plan the tool change. Coming to the method we are proposing: I want to walk you through again how we approached this use case. We started, like every other use case, by defining our project requirements. Then we took all the data, as many other companies in the Industrial Internet of Things try to do.

We said, "Okay, we need all the data." We tried to take all the signals from the machine and analyzed them somewhere else. Then we looked at the data and asked, "Okay, is the quality of this information good enough? Does it meet our project requirements?"

It did not meet our project requirements, because of the missing data. With the missing data we were not able to see the right values to get the relevant information. So we said, "Okay, let's go to the machine and understand the process a little better. Why is this happening? What can we do about it?"

Then we ran the experiments Luis explained directly on the machine and collected the data locally. We came back to our analysis process and said, "Yes, now the data is good, the quality is good."

Then we also asked, "Okay, is all of this data relevant? Is there a way to reduce the storage without reducing the data quality?" And we decided to implement this preprocessing algorithm directly on the machine to reduce the size of the data.

What we suggest to you, too: when you start a use case for a production process, after defining your project requirements, go directly to the machine, start doing experiments there, and collect the data locally. If you do this step first, you will spare yourself a lot of the time needed to build the architecture for moving all this data somewhere else.

You will also spare yourself a lot of money, because you may not need that much space on your servers, and so on. If you start directly there, you can go through all the other steps and end up with a use case that works best, in less time.

If we summarize our lessons learned and the benefits of this use case: we can definitely say that an application-oriented approach works very well for implementing production use cases. You really need a deep understanding of the process and the machine for Industrial Internet of Things use cases.

It is definitely better to build a team of engineers, the people who work at the machines, and the data people together, because you need a really deep understanding of what is happening and what exactly you need in order to get a benefit out of it.

Our personal benefit from this specific use case was a method that we can reuse for other machines and processes, which we are also sharing with you today in the hope that you can use it for your processes as well. The method we created could indeed be applied to other machines and to their punching processes.

By the end of this use case we also had really good knowledge about the state of tool wear. We could reduce our downtime, because instead of waiting for a tool to wear out, we were able to plan the downtime, which automatically means we were also decreasing our costs.

On top of that, we were able to use this method and this analysis to follow the long-term behavior of our tools, which is also great, because in the end we had a real predictive maintenance use case. And as the cherry on top, we reduced our data storage needs significantly.

In today's world, where we talk so much about energy, it is important to keep just the relevant data on our servers, because that is more sustainable and more energy efficient. We were really happy with our results, and we hope you take some inspiration from our method and maybe can use it yourselves.

Thank you for your attention, and this was our method. Have a nice day.