Should I Bring My Umbrella to Manchester? An Analogy for Industrial Process Monitoring
At the heart of industrial processes, more and more data is being collected and stored. This data is the mirror of process behaviour, and identification of its content in real time is the key to success.
Using a case study based on meteorological data, this presentation walks through all the steps needed to maximize the impact of industrial data and to bring decision making as close as possible to the process.
The use of a REST API, or database access, to obtain real-time information is therefore crucial. In this case, an HTTP request is used to regularly retrieve the next seven-day weather forecast for Manchester. Then, after automatic data preparation, from JSON parsing to the definition of specification limits, JMP Live reports are automatically generated. These reports can include interactive visuals or statistical monitoring tools, such as control charts. This SPC monitoring follows a set of constraints linked to the design space of weather acceptance and includes alarms if drifts occur. Via email, these alarms notify everyone concerned about the existence of drifts so that they can react accordingly. In this way, the entire SPC chain of data access, data preparation, reports, control chart updates, and live communication is fully automated within JMP Live.
Hello everyone, and welcome to this presentation, Should I Bring My Umbrella to Manchester: An Analogy for Process Monitoring. I am Florence Kussener, Principal Systems Engineer at JMP, and I joined JMP in France 12 years ago.
For this subject, I wanted to create a case study that reproduces all the steps of process monitoring that you may have in your daily work: definition of the problem, collection of data, all the analysis steps, the sharing of the information, which is very important and on which I will spend time, and the updating of all these different steps. One thing that is also important in process monitoring is improvement, and I will spend some time on that too. That is my agenda for this session.
To do that, I needed accessible and regularly updated data, so for this process I use weather data. The situation for this session is the following: when you have a journey to prepare, you have to pack everything. The question is, do I need an umbrella, do I need a coat? To answer this question, you need to check the weather.
The usual way now is an application on the phone or the computer where we can check the weather forecast. That is exactly what I will do here: can I reproduce all these steps in JMP?
First, I need to define my design space. In daily work, the best approach to define your design space is to use design of experiments, whatever the design may be. In some situations you may already have data; perhaps your historical data can give you some insight about your design space. If those data are not good enough on their own, you can combine your historical data with a DOE.
In the end, the important thing is to find how much variability in your inputs you can allow while still reaching the quality of your product. My situation is a little bit different: I created the data and this process during the summer, so I had no winter data, and I can't create a design space for weather data anyway; it's not possible. I need to explore different indicators, so for my situation I chose temperature and precipitation.
What I did was create a space-filling uniform design, and I filled it with some a priori information. You can see my temperature and precipitation, and I rated my feeling between 1 and 10. There are acceptance areas, highlighted in green by my two squares.
The first one is the one we will focus on today: a precipitation rate that is not too high and a temperature between 8 and 30 degrees. Clearly, I rapidly get cold below 8 degrees. Above 30 I can be okay, but it depends on the situation; it's not always acceptable.
The other area is more around zero degrees with a lot of precipitation. That's because I am fond of skiing and snowboarding, so I'm very happy in that situation. But in Manchester that's not the case; we are not in the Alps, and we will not do any skiing or snowboarding. So I focus on the first area: precipitation below 1.5 and temperature between 8 and 30.
Next, I need some data. Usually, when you have data that are updated regularly, you work with a database, so an ODBC connection is, I would say, the most common route, but not the only one; some people use Excel or CSV files as a database.
In my situation, I will use something a little more specific: an API. This API lets me connect to the Visual Crossing website and extract data from it using an HTTP request.
Let's have a look at the website so you can see what's on it… It should be opening; no idea why it's so slow today. While it loads, to summarize: this website is very nice because it offers several parametrizations.
First, you can choose which city you are looking for. For this session we are looking at Manchester, but you can choose your own city. You can choose the period of time: do you want 15 days, a date range, some dynamic period, only two days? You can take the last seven days or the next seven days if you want. You can choose your period and your units; I am French, so I am in Celsius, but you can change that.
You can also say what is important for you. Do you want information daily or hourly? Which indicators do you want? I picked some; I didn't take everything. You can have the humidity, the precipitation, the precipitation probability, whether you have snow or not, the feels-like temperature, which can be different from the real temperature, the wind, the solar radiation, and descriptions of all of them.
Here you have the data table, and you can get the data in other formats, which is the interesting part for me: I can have the data as JSON, for example, and I can already get some charts. But what really interests me is that this website creates the API for me: a URL that encodes all the parametrization I did previously.
Here I have this URL and I can use it. You see that in the URL you need what we call an API key: you need to sign up and create one, which is what I did on my side. You can see that I generated an API key; that is the name I put there. I also defined a variable for the city, so my script can change very easily: if I want to check what is happening in Paris, I can change it in a moment.
Here you can see the URL, which is a little more complex than the one we had previously. You can see that I want the next seven days, so I am collecting prediction information, and you can see which parameters specifically I want. I didn't take everything; I just took temperature, precipitation, wind, and solar radiation.
I said that I want the daily information, but there is also hourly information. You see my API key there. And at the end, I want the output as a JSON file, because JMP handles JSON files pretty well, so that's what I request.
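To make the request concrete, here is a minimal sketch, outside of JMP, of how such a parametrized URL can be assembled (the talk does this in JSL). It follows the Visual Crossing Timeline API pattern described above; the exact element names, the city string, and the `YOUR_API_KEY` placeholder are assumptions you would adapt to your own account:

```python
# Sketch: assemble a Visual Crossing Timeline API URL for a 7-day forecast.
# "YOUR_API_KEY" is a placeholder; sign up at Visual Crossing to get a key.
from urllib.parse import quote, urlencode

BASE = "https://weather.visualcrossing.com/VisualCrossingWebServices/rest/services/timeline"

def build_forecast_url(city: str, api_key: str) -> str:
    """Return the Timeline API URL for the next-7-days forecast of `city`."""
    params = {
        "unitGroup": "metric",        # Celsius rather than Fahrenheit
        "include": "days,hours",      # both daily and hourly records
        # element list mirrors the talk's selection (assumed spellings):
        "elements": "datetime,temp,feelslike,precip,precipprob,"
                    "windspeed,winddir,solarradiation",
        "contentType": "json",        # JMP parses JSON well
        "key": api_key,
    }
    return f"{BASE}/{quote(city)}/next7days?{urlencode(params)}"

url = build_forecast_url("Manchester,UK", "YOUR_API_KEY")
```

Changing the city variable is then enough to point the same script at Paris or anywhere else.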
Now I have my HTTP request, where I say that I want to get the information from the website. After that, I just have to parse the JSON file. We can see two levels in this JSON file: one related to the days, and, for each day, the hourly information; that's why there is a second level on the third element there. For each hour, again, I have information.
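The two-level structure can be illustrated with a tiny hand-made payload. This Python sketch (the talk does the equivalent in JSL) flattens the days/hours hierarchy into one row per hour; the field names mimic, but are not guaranteed to match exactly, the real Visual Crossing response:

```python
# Sketch: flatten a two-level days/hours JSON payload into hourly rows.
# The payload below is a toy stand-in for the real API response.
import json

payload = json.loads("""
{
  "address": "Manchester,UK",
  "days": [
    {"datetime": "2023-12-05", "temp": 4.1,
     "hours": [{"datetime": "00:00:00", "temp": 3.0},
               {"datetime": "01:00:00", "temp": 2.6}]},
    {"datetime": "2023-12-06", "temp": 6.8,
     "hours": [{"datetime": "00:00:00", "temp": 5.2}]}
  ]
}
""")

def flatten_hours(doc: dict) -> list[dict]:
    """One row per hour, stamped with its parent day: the shape of the
    data table shown in the talk."""
    rows = []
    for day in doc["days"]:
        for hour in day["hours"]:
            rows.append({"date": day["datetime"],
                         "hour": hour["datetime"],
                         "temp": hour["temp"]})
    return rows

rows = flatten_hours(payload)
```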
At the end, I have my data table. This screenshot was done in December: you can see the seven days collected, plus the actual day, to be honest. This is the collection of the 5th of December, so I have the 5th through the 12th, seven days. But I also collect the real information, that is, what happened yesterday.
That will enable me to make some comparisons at the end. We can see the variation in precipitation, with some outliers, and the variation in temperature. For this week of December, the forecast temperature was between minus 2 and 12 degrees.
For the analysis part, I created several different tools; you probably know some of them. I made a graph with the real temperature and, in yellow, the feels-like temperature. We can see that sometimes the real temperature is quite different from what we are feeling.
I also have a heat map that shows the change of weather: you can see that the first three days are colder than the following, warmer days. This one is for temperature, but I put a column switcher on it, so I can change it to check other things, and I have the equivalent as a control chart.
We can see the same thing: very low temperatures at the beginning of the next seven days. What we can also observe is that I put my spec limits as control limits, so I can rapidly check when I am outside my spec limits and get alarms. As soon as I am below eight degrees, Test 1 alarms are activated, and the same a little above 30 degrees, though that is not the question at this time of the winter. In this part, I am okay.
Another report that I really like when doing SPC is the Process Screening report, which gives an insight into all the parameters at the same time in one report. For the different elements I collected, such as temperature, wind speed, probability of precipitation, wind direction, solar radiation, and precipitation by hour, this report gives me the stability, the alarms, and the rate of the different alarms; in my situation, the Test 1 alarms are the important ones.
We can also see our Cpk and Ppk. Clearly, the Cpk is negative for the temperature. We have a good situation this week for the wind speed, which is nice. I didn't make any statement about the wind direction at Manchester, because I have no idea what impact the wind direction has on the weather and the comfort there, but in some situations it can be useful.
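For reference, the two capability indices can be sketched like this: Ppk uses the overall standard deviation, while Cpk uses a within-subgroup estimate (here the average moving range divided by d2 = 1.128, one common convention for individual measurements; JMP's exact method may differ). The cold-week temperatures are invented to reproduce the negative-index situation described:

```python
# Sketch: Ppk (overall sigma) and Cpk (within sigma via moving range)
# against the talk's 8-30 degree acceptance limits.
import statistics as st

LSL, USL = 8.0, 30.0

def ppk(xs):
    m, s = st.mean(xs), st.stdev(xs)          # overall (long-term) sigma
    return min((USL - m) / (3 * s), (m - LSL) / (3 * s))

def cpk(xs):
    mr = [abs(b - a) for a, b in zip(xs, xs[1:])]
    s_within = st.mean(mr) / 1.128            # moving-range estimate, d2=1.128
    m = st.mean(xs)
    return min((USL - m) / (3 * s_within), (m - LSL) / (3 * s_within))

cold_week = [2.0, 3.5, 1.0, 4.2, 0.5, 3.0, 2.5]   # invented, well below LSL
```

A mean far below the lower spec limit drives both indices negative, which is exactly the "very bad situation" the temperature shows.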
For example, in the south of France, depending on the wind, the weather that will appear in the next few hours and days is not the same. So these are the different tools I created for the analysis and the follow-up of the data.
Once the analysis is done, the need is to share it. There are usually two ways to do that with JMP. The first way is to use a data table: I share my data table with the others, with the scripts attached, and they can see what happened.
Or you can use JSL, which helps you automate more: you run your JSL, and if you are happy with it, you can put it in an add-in. That's pretty easy, because you enrich your JMP environment with your own tools, and you can share them with others.
The only difficulty with that, I would say, is maintenance in some situations: if I update my JSL or my add-in, I need to share it again with the others. And you need JMP to use it. Sometimes control charts need to be shared with people who are not doing the analysis; they just need to have a look at what is happening, what the defect rate is, and what decision to make given that defect rate.
In that situation, when you need to share with non-JMP users, what is often done is to create static reports. They are easy to share, but you have restricted use of JMP features: all the interactivity is lost, surely, and you have to work around the handling of a static report.
What I use in my situation is web reports through JMP Live. The big advantage is that they are easy to share: I keep the flexibility and interactivity of JMP, and I can share with JMP users but also with non-JMP users. I can also restrict who has access to each report, so there is some security, which is important.
The other thing is that the whole workflow can be validated. For a pharmaceutical company, and I am thinking about them here, validation of every step is important, so it's pretty nice to have the whole workflow, like the JSL, validated on the server, which will share everything.
You can see the same reports that I showed previously. Let's open one and check what is inside. Here I am opening my web page. There is a security check: do I have the right to see this report? My name is already saved there, it remembers me, so it will be okay. You can see that I am Florence, I am the owner of this report, and I have the different reports there.
We can see when each report was last updated. Today is the 19th of January, but my report was updated on the 15th of January; I will come back to this point later. If I have a look at the temperature, for example, we can see what this week's prediction data looked like for Manchester.
This week was forecast as pretty cold until the weekend, when a switch of temperature will happen in the UK. Looking at precipitation, temperatures get better, but we can have a very rainy day on the 21st. The solar radiation is the opposite: if we have rain, we have clouds, so less solar radiation. And looking at the wind speed, yes, the 21st will be rainy and windy.
That is the first consultation I can do. Surely we also have the other reports, like the Process Screening report, which highlights these different things. You see that for this report I removed the wind direction at the end. We are pretty good on the wind, but the temperature is very, very low: our Ppk and Cpk are negative, so it's not a good situation. On a real process, it would be a very, very bad situation.
If I have a look at the control chart, you can see that it is a little bit special: it is highlighted by this red flag, which you can also see there. It tells me that I have some active warnings. What does that mean? It means that the warnings I activated in my JMP environment also appear here. And I can see that, yes, I have some warnings: over this whole week of predictions, there are only a few hours where I will be above eight degrees.
I have warnings not only on temperature: yes, we will need an umbrella this weekend, because precipitation is scheduled for the weekend, we will have almost no solar radiation, and, yes, we will have wind.
This is, I would say, the analytic counterpart of my earlier heat map, with more analytics in the control chart. I can see, again, my warnings there. If I look at the summary, I can see the proportion of warnings in each control chart. For temperature I have a very high number of warnings: 94% of my points are outside my control limits, so it's a bad situation, as I said. For precipitation I have only 36%, and for the wind speed, 10%.
In this example I activated only Test 1, because it's the only one that makes sense here, but you can use the others, such as tests for increasing or decreasing runs. In my situation, if I used those, I would have a warning every day, which makes no sense; it's just the fact that the temperature changes during the day. So I did not activate them, but you can choose to activate whichever warnings you want.
This environment is also collaborative. It means that if you have some specific situation in your control chart, or in any report, and you want to make some comments, you can describe your observation here, and you can also tag other users for action.
All the comments are stored there, in that place. And you can open the report in JMP: JMP users who need to do more advanced work, like a root cause analysis, can open it in JMP, which downloads the data set so they can explore the data. If you are interested in a report, you can bookmark it; here I have a set of bookmarked reports, reports from colleagues or other users that I found very nice. You can share those as well.
One thing I wanted to show you, very specific to control charts, is that you can subscribe to them. I will come back to that, but subscribing to a control chart means that you will be informed when there is a warning: you will receive an email. Let's see that a little later.
Here I have my four reports that I can share with whoever wants them. If I come back to my slides: you saw that the point of these reports is the sharing and the collaboration across the organization. But one thing that is very nice as well is the ability to automate the publishing. I just mentioned that you can subscribe to a report, so let me show you what happens when you do.
As I am the owner of the report, each time the report is updated I receive this kind of email, telling me that the data post has been refreshed successfully. That's just because I am the owner of the report; not everyone receives it.
Regarding the control chart warnings, the people who subscribe to the report receive those. You see that on Monday, when the data was refreshed, I received this email telling me that I have some control chart warnings. It carries exactly the same information as my report: I see that I have a warning rate of 94%, which we saw just previously, and I can see when the warnings occur.
I know it is happening now; we are in a very cold period, so it's now. If I look at another one, for example precipitation, the email sent on the 15th told me that I will have some precipitation, but in a few days.
That can give you some insight: is it a crisis that I have already covered, or is it something new appearing in my control chart? If I click on the link there, I go directly to the web page. It's no longer me going to open the control chart to check whether something is happening; now the control chart comes to me and says, "Oh, something is happening." And if nothing happened, surely I don't need to spend time on it.
How does it work? If I go back there, it opens directly on the control chart. And you see that I have different files related to my project: my different reports, and also the data set. This data set has a history, which is there. You can see in my history that previously, at the beginning of December, I was collecting information every day; it was updated every day.
At some point, I decided to change things and set a scheduler for every week, so now I extract the data every week. Why? Because I realized that the website gives me the right to access only a certain amount of data before I move into a paid plan, and I didn't want to do that. So I decided to update the data every week; knowing that I don't go to Manchester until March, I don't need daily information or a daily email about it.
How did I do that? If I go into the settings, you can see that I set it to every Monday at 10:00 PM, UK time zone; in my time zone it's 11:00, which is why I get the email at 11:00. I can see that I can repeat it, and if I want not daily but a more real-time frequency, I can make adjustments: I can go down to every five minutes if I want, or every hour, or every 12 hours. It's as you wish.
You can say that you want it every hour but only during working hours, so you don't get information during the night, for example, and you can set an expiration date. You can also have several schedulers: say one every week and one every hour, with different mappings for the weekend, for vacation, or for daily work.
On the left side, you can find a script. This script holds all the data access and data preparation that I did. You can see the access with my API key, the town, the spec limits that I set, and the URL: exactly the same code that I showed you in my slides previously, with all the data preparation, including my spec limits in the column properties and everything.
All my data tables are ready, and at the end I have to tell JMP which data table to use. It's just because I have several data tables, so you need to finish with a reminder of your current data table. That is my refresh script, which is used for the update. Here you can also put the source script; sometimes it's different, sometimes not. The assigned credentials are more for when you are connected to a real database and have to give credentials to some people.
Here you can see that you can view the data, surely, which opens a grid of the data, but what I wanted to show is that you can ask to refresh the data. When I do that, you can see that the refresh script is running. If I now go back to my data table, the different reports are regenerated. I can go back to my Control Chart Builder, for example, and we can see that I now have the information from today to my next days.
What happens in my mailbox is that I receive, at that moment, an email saying that my data set has been refreshed. The other email I receive says that I have some control chart warnings, and you see that the warning rate is lower. We are now the 19th at 11:38, I have fewer warnings, and, yes, the temperature seems to be a little bit higher. It's very convenient to have this automation through JMP Live: you can make a decision rapidly, close to real time, depending on what is happening in the data.
The last thing I wanted to cover before concluding is improvement. When you are in process monitoring, even if you define your design space very well, you may have crises or other situations. If you have a crisis, you are in firefighter mode; otherwise, perhaps you are moving in a continuous process improvement direction.
Perhaps you have a change, such as an equipment change or a raw material change, and you need to run an analysis, a design of experiments, to check that things are still okay. You can also have new constraints; for example, a lot of companies are now impacted by ecological constraints on the carbon appearing in their manufacturing process.
In some situations, you may also have to decide which metrics are critical, because collecting data has a cost. Sometimes your data are highly correlated, or you are collecting things that are not so important, so perhaps you need to define which metric is critical. In my situation, with prediction data, it's a bit different: I can't do a root cause analysis, but remember that data acquisition is costly if I acquire too much.
So I started thinking: do I really need to collect data every day, or is every week good enough? What is the prediction quality, in fact? I collect the data and I check what the difference is between the real data, the reality, and the prediction.
Here, the lighter the color, the more days ahead the prediction was made. From this view, I was not able to draw many conclusions, so I created a distance-to-reality variable by prediction day. Here is my target, and if I look at the first day, the prediction for today, I am here, in this darker red. As we predict more and more days ahead, the color gets lighter and the distance gets larger. It seems that, yes, we can perhaps relax the collection frequency, but not too much; otherwise I will see a big difference in temperature.
Over the month of data I collected, I observed that on the day itself, focusing on the temperature, the median difference is less than one degree. After that, the mean difference grows, up to about two degrees at six days ahead.
But what is important for me is my comfort. The mean is one thing, but I wanted to focus on the maximum difference. Looking at the maximum difference, one day ahead and two days ahead are roughly equivalent, and after that we have a step. The first three days are clearly equivalent, so it seems I could collect one measurement every three days.
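The horizon study above can be sketched as a group-by on absolute forecast error. The records here are invented to mimic the pattern described (small errors close in, a step further out); only the mechanics, not the numbers, reflect the talk:

```python
# Sketch: group absolute forecast errors by prediction horizon (days ahead),
# then compare median and maximum error per horizon.
import statistics as st
from collections import defaultdict

# (days_ahead, forecast_temp, actual_temp) -- toy records
records = [
    (1, 6.0, 6.4), (1, 9.1, 8.8), (1, 4.0, 4.9),
    (2, 6.5, 6.4), (2, 10.0, 8.8), (2, 3.2, 4.9),
    (3, 7.0, 6.4), (3, 10.4, 8.8), (3, 3.0, 4.9),
    (5, 9.5, 6.4), (5, 12.8, 8.8), (5, 0.4, 4.9),
]

by_horizon = defaultdict(list)
for days_ahead, forecast, actual in records:
    by_horizon[days_ahead].append(abs(forecast - actual))

# {horizon: (median_error, max_error)}
summary = {h: (st.median(errs), max(errs))
           for h, errs in sorted(by_horizon.items())}
```

The max error, not just the median, drives the decision here, because one badly missed cold day matters more for packing than the average miss.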
When I have a closer look, we can see some outliers in the first three days. Looking at those first three days, one day creates outlier differences for the first two horizons, and each time it is the 25th of November (I collected the whole month of November, through the 30th). We can see another day with a big difference appearing afterwards: the 21st of November.
If I have a look at the 25th of November, clearly the reality was far from what was forecast: much colder. If I look at the whole month of real measurements, yes, on the 25th there is a very big change of temperature. It even made the news: the first cold snap for much of the UK, with the temperature dropping below zero. If you look, it's the 25th of November. There was a very big change that was not forecast.
In the end, to be sure not to make an error because of this outlier, I removed it. When I remove it, I see that the predictions for today and for tomorrow are pretty close, but one day further out, differences start to appear.
So I decided to stay with every other day, that is, collection every two days, and if I run a t test, yes, the mean does not seem different between collecting every day and every two days. In the end, it's something we can change: I showed you the web page where I can say, "Okay, I want to collect the information every two days and not every day." For the moment it's every week, but you can make the change very easily.
To conclude: with this weather example, I was able to reproduce all the process monitoring steps, from the definition of the design space through to the updates. Once the design space was defined, the data collection, the data analysis, and the sharing are all updated, automated every day.
As for the improvement: usually, when we make improvements, we work on the definition of the design space. In my situation, I used an improvement approach to explore the data and to check what the most appropriate connection frequency to the database is.
I hope you saw some parallels with your daily work on your manufacturing site and with all your process control. If you have any questions, I will be very happy to answer them. Have a nice day.