Functional Data Explorer: Visualizing Results to Non-Data Scientists (2021-EU-30MP-765)

Level: Intermediate

Simon Stelzig, Head of Data-Driven Development, Lohmann

In the course of adhesive formulation and tape development the developer is supported by the data scientist in handling the analysis of the gathered data resulting from design of experiments, the analysis of historical data or preliminary trial and error runs. Hence, a data specialist talks with a non-specialist about the results, facing the need to present complex data analysis in a non-complex way. Especially in case of the analysis of spectral data, the data is often reduced to a few characteristic numbers for analysis. However, the developer still thinks in spectra, rather than numbers, making the visualization and interpretation of the results difficult. The Functional Data Explorer is a great tool to perform such analysis and visualize the results in a spectral-like form. As an example, the dependence of the rheological behavior of a structural adhesive tape, namely the reaction start, on the formulation ingredients is analyzed using the Functional Data Explorer. Another example is the analysis of the dependency of the molecular weight distribution on the process parameters during the synthesis of a pressure-sensitive adhesive. In both cases, the resulting predicted spectra greatly help the developer to gain insight and develop a mechanistic understanding.

Auto-generated transcript...

Speaker	Transcript
Simon Stelzig	Okay, so hello everybody. My name is Simon Stelzig. I'm from Lohmann, the bonding engineers, in the research and development department, and today I would like to tell you something about the Functional Data Explorer and how you can use it to visualize results for non-data scientists.
	So, before I start, a very short introduction to the company Lohmann. You might not have heard of it, but you might come across its products on a daily basis.
	So just some facts and figures about Lohmann. Lohmann had a turnover of about 600 million euros in 2019, and the part I'm working in is the Lohmann Tape Group. We are a producer of adhesive tapes, mainly double-sided adhesive tapes, and we are a pure B2B company, so
	you can only purchase our products for industrial applications. We're quite old: this year it is about 170 years since the company Lohmann was founded.
	We are basically divided into two parts. One of them is tape products, where I'm working, with the focus on
	adhesive tapes for industrial applications, and the other is our hygiene brand.
	And we are globally active. We have 25 sites around the world, with roughly 1,800 employees. And maybe very interesting:
	90% of our products are customized, so we have a quite complex product portfolio, with a lot of products focused on a very small number of applications for a very small group of customers.
	And so, we are the top producer of adhesive tapes, mainly double-sided adhesive tapes, with a lot of applications, and these are the market segments we are in.
	So one of them is industrial applications. That's mainly windows and doors, or indoor and outdoor applications in the building and construction area.
	Or you might come across Lohmann products in home appliances and electronic devices, for example adhesive tapes for fixing the back cover on smartphones. Or in the
	medical segment, mainly for diagnostic applications and wound care. In the field of transportation, for automotive, maybe for interior or exterior
	adhesion of emblems on cars. We are also, for example, in the graphics industry for flexographic printing applications, where we deliver adhesive tapes
	to fix the printing plate on the sleeves. One subdivision of the Lohmann Tape Group is the hygiene brand, where we deliver closure systems for diapers.
	So, a very wide range of applications, and, as I just mentioned, about 90% of our products are customized. Another fact about Lohmann is
	that our value chain is very deep. If you think about an adhesive tape, you need an adhesive layer which
	sticks onto the substrate you want to adhere to.
	You need, optionally, a carrier which carries your adhesive layer, and, since we are a producer of double-sided adhesive tapes, you have on the other side again
	an adhesive layer. And Lohmann's value chain basically starts at the very beginning, so we can make our own polymer, the base polymer for an adhesive.
	Then we have to formulate the polymer into an adhesive formulation, which in the next step we put on a carrier, giving an adhesive tape in the form of
	quite large ??? rolls. And at the very end, we can deliver it in the form
	the customer wants it to be. So, if you have some die cuts, maybe some emblems for cars, where the adhesive tape needs a specific form, we deliver that specific form as the customer requires it.
	And R&D focuses mainly on the first three parts: the polymerization, the formulation and the coating. As I mentioned, a lot of our products are customized, so in order to get as quickly as possible to the goal of fulfilling our customer requirements, we apply
	design of experiments throughout the course
	of getting to the final product.
	And doing design of experiments, or overall a lot of experiments, also gives a lot of data.
	And since we are a chemical company, much of the data you get from these experiments is spectra.
	So most of our experiments don't give you a value, a number; they give you a spectrum.
	And, like in my case, I am, let's say, the data scientist delivering the service of planning those experiments, analyzing
	the experiments and doing the whole data analytics for my colleagues, who in that case are the non-data scientists.
	They are the developers of the final product, and they use me as a service to do all the data analytics. In most cases, you extract from those spectral data key parameters which describe your spectra.
	In the case of polymerization, that might be the molecular weight distribution, meaning the size of the
	polymer which we get out of the reaction; or it might be the start of a reaction, detected by some means of measurement, or something else.
	So if you do the data analysis, you extract those key parameters and do the analysis, most of the time a classical DOE analysis, on those key parameters.
	So, as a data scientist, as a service to my colleagues, the developers, I do this analysis on those key parameters.
	They also deliver to me the customer requirements, that is, the optimum which the formulation or product should reach.
	Having done the analysis and the optimization, at some point I go to my colleagues and present the results.
	And since I did all the analysis on the key parameters, I also present the results on those key parameters.
	The problem is that my development colleagues, my project teams, are experts in their own area of expertise, but not experts in data science.
	So if I do the analysis on the key parameters and also talk in the language of key parameters, they, as the experts in their area,
	are still thinking in spectra, because that is what they get when the experiments are done: they see spectra where I see numbers.
	That always leaves the problem that if I present the results and the analysis in key parameters, they still think in spectra.
	So you always have the problem that you are talking in different languages. I talk in the language of numbers, they talk in the language of spectra,
	which means they have to translate those numbers into their area of expertise, the spectra.
	Or I have to translate these key parameters into their language, the spectra. Luckily, the Functional Data Explorer is a kind of universal translator that resolves this problem.
	What I want to show you today, with three examples, is how the Functional Data Explorer greatly helps you in overcoming this language barrier between the data scientist talking in numbers and the developer thinking and talking in spectra.
	The first example I want to show you is a very simple, maybe trivial one: a measurement which describes the printing quality in a flexographic printing
	experiment. The second one is a measurement which defines the start of a chemical reaction. And the third one is a measurement which gives you the molecular weight of a polymer
	during a polymerization reaction. I'll stop using PowerPoint now and go over to JMP. Yeah, there it is.
	So the first example, as I told you, is the analysis of the printing quality. You do the measurement, and the ideal, perfect spectrum would be a
	straight line going through the origin with a slope of 1. That would be the ideal; here it is one measurement of the printing quality.
	Normally, in a real-life environment, you get something which is not really a straight line but is shifted a little towards higher values, so the curve is bent upwards.
	So what do you want to do with that information? You want to screen which parameters bend the curve upwards, that is, make it non-ideal.
	So I go to the key parameter. What would we define as a key parameter? It's basically the difference between the straight line and the actual line which you get. It's pretty simple: you just use
	the sum of the differences at each point.
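A minimal Python sketch of the key parameter just described: the summed deviation of a measured curve from the ideal line through the origin with slope 1. The function name, the use of absolute differences, and the sample numbers are assumptions for illustration; the talk does not specify them.

```python
def deviation_from_ideal(x_values, y_values):
    """Sum of absolute differences between a measured curve and the ideal line y = x."""
    if len(x_values) != len(y_values):
        raise ValueError("x_values and y_values must have the same length")
    # For the ideal line (slope 1 through the origin) the expected y equals x.
    return sum(abs(y - x) for x, y in zip(x_values, y_values))


# Hypothetical measurement: the curve is bent upwards away from the ideal line.
nominal = [0, 10, 20, 30, 40, 50]
measured = [0, 14, 27, 38, 46, 52]

print(deviation_from_ideal(nominal, measured))  # 27
```

A perfectly ideal curve gives 0, and larger values mean a stronger bend, but the single number says nothing about where along the curve the deviation occurs.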
	So the analysis on the key parameter is also very simple. I use a decision tree to screen which parameter most influences this bending away from the ideal spectrum. We have about 22 influencing factors.
	There you see one which influences the sum, this bending away from the ideal situation.
	It's called X22; whatever it is doesn't really matter. So
	if it's larger than a certain value, you get, let's say, a number of 250; if it's below that value, you get 196. The problem is, if you show this to a developer,
	he will ask you: is 196 good, or is 254 still good, or is it very bad, or is it maybe still suitable?
	The problem is that with the pure number you don't really see that, because he is talking in a different area, in a different language.
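The screening step can be illustrated with a single-split search, which is the first node a regression tree would choose. This is a sketch in plain Python, not the JMP implementation used in the talk; the factor X1, the data, and the function name are hypothetical, and the response values were picked so that the two leaf means come out at 196 and 250.

```python
def best_single_split(rows, response):
    """Find the factor and threshold whose single split most reduces the
    sum of squared errors of the response (first node of a regression tree)."""

    def sse(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best = None
    for factor in rows[0]:
        levels = sorted({row[factor] for row in rows})
        # Candidate thresholds are midpoints between neighbouring factor levels.
        for low, high in zip(levels, levels[1:]):
            threshold = (low + high) / 2
            below = [y for row, y in zip(rows, response) if row[factor] < threshold]
            above = [y for row, y in zip(rows, response) if row[factor] >= threshold]
            error = sse(below) + sse(above)
            if best is None or error < best["error"]:
                best = {
                    "factor": factor,
                    "threshold": threshold,
                    "error": error,
                    "mean_below": sum(below) / len(below),
                    "mean_above": sum(above) / len(above),
                }
    return best


# Hypothetical runs: two factors, response = summed deviation from the ideal line.
rows = [
    {"X1": 1, "X22": 2}, {"X1": 1, "X22": 8},
    {"X1": 2, "X22": 3}, {"X1": 2, "X22": 9},
    {"X1": 3, "X22": 1}, {"X1": 3, "X22": 7},
]
response = [195, 252, 198, 249, 195, 249]

result = best_single_split(rows, response)
print(result["factor"], result["threshold"], result["mean_below"], result["mean_above"])
```

The output is a factor name and two leaf means, which is exactly the kind of result that leaves the developer asking whether 196 is good enough.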
	And the Functional Data Explorer can really help you to answer that question, because if you move over to the Functional Data Explorer and do the same analysis there,
	you get something which looks like this. You show the developer the graph, the graph he is used to, which he normally gets from his measurements or his experiments.
	Now you have the same parameter which influences the shape of the curve. This shows the one which,
	from the analysis of the key parameter, would be your 250, and this would be 196, and you see the difference. I mean, that is not really a straight line.
	But if it moves down to the lower number, the one from before, 196,
	it doesn't look too bad. It's not really a perfect straight line, but it's not far away from being
	a straight line, and maybe at that point, if you show this kind of analysis to the non-data scientist, the developer,
	that's already good enough for him. Maybe at that point he says: okay, it's not a straight line, it's not perfect, but it's good for the job.
	Basically, it fits the customer's need. And that's something which he doesn't see from the pure number which you show him. So in that case it's a very trivial, very simple example,
	but I hope it gives you the essence: using the Functional Data Explorer enables you to talk in the language of the experts, the non-data scientists, maybe the developer of the material.
	And talking in his language helps him to get this insight. He doesn't have to translate the number into a spectrum; he sees it, and maybe gets the answer which he needs.
	So again, a very trivial, very simple example, but I hope it shows you the essence of what the Functional Data Explorer helps you to achieve.
	So, moving on to the second example: that's the start of a chemical reaction which you want to detect.
	We do that with a ??? measurement, where you want to see a change in modulus as you heat up a certain chemical formulation. A typical spectrum would look like that,
	where at a certain point the modulus moves steeply upwards. That is normally the start of the chemical reaction. So the key parameter which you extract from that spectrum is the point where the modulus starts to go up sharply.
	Now it might be a little more complicated to calculate that point from the original raw data, but nevertheless you can do it, determine the point where the reaction starts, and then do the analysis on the key parameter.
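One simple way to extract such an onset point, sketched in plain Python: take the first temperature at which the modulus rises faster than a chosen slope. The threshold rule, the function name, and the data are assumptions for illustration; the talk does not say how the onset was actually calculated.

```python
def reaction_onset(temperatures, modulus, slope_threshold):
    """Return the first temperature at which the modulus rises faster than
    slope_threshold (modulus units per degree), or None if it never does."""
    points = list(zip(temperatures, modulus))
    for (t0, m0), (t1, m1) in zip(points, points[1:]):
        if (m1 - m0) / (t1 - t0) > slope_threshold:
            return t0
    return None


# Hypothetical heating run: the modulus is flat, then rises steeply from 70 degrees.
temperatures = [40, 50, 60, 70, 80, 90, 100]
modulus = [10, 10, 9, 9, 30, 120, 400]

print(reaction_onset(temperatures, modulus, slope_threshold=1.0))  # 70
```

The result depends on the chosen threshold and on noise in the raw data, which is part of why reducing a curve to one number can be less straightforward than it sounds.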
	Doing that,
	what you get from this analysis on the key parameter is a model which allows you to predict the start of the chemical reaction based on, in this example, five parameters. So,
	I will see: okay, this key parameter, the start of the chemical reaction, depends on those three parameters,
	and basically you get a range from something like 70 degrees, where the chemical reaction starts, up to about 100 degrees Celsius (oh I'm sorry, that's the wrong one)
	to about 100 degrees, yeah. So again, you reduce the total spectrum into one number,
	giving the developer a clue where this chemical reaction starts.
	For the same analysis you already have the data: if you want to extract the key parameters, you need the raw data, so you also have the raw data available.
	So it's not really much more effort to do the analysis within the Functional Data Explorer than using a standard regression method on key parameters.
	So,
	using the same data and the Functional Data Explorer to
	analyze the data just takes a couple of seconds to calculate the outcome.
	There it is.
	Again, now you can model the whole spectrum. In that particular case, I didn't show the developers the analysis on the key parameters first; I showed them the results from the Functional Data Explorer first.
	And when I just started to move around the influencing factors, they immediately noticed
	that the point of the reaction changes. I didn't have to tell them that the point started to move because, being the experts on the spectral data,
	they immediately recognized that the start of the chemical reaction shifts, as you can see here.
	The other thing you see, because now you have modeled the full spectrum and not only the key parameter, is
	that the way the spectrum looks totally changes if you move the factors. For example, they will see that the peak starts to move and the height of the peak
	changes. That was also recognized immediately after they saw this analysis.
	And the difference to the key parameters is this: if you reduce the spectrum to one key parameter, the start of the chemical reaction, you can of course use other key parameters as well, maybe
	telling you the position of the maximum peak here, or the height of the maximum peak.
	And then you can split the spectral data into different kinds of key parameters, analyze those key parameters one after another, show them to the developers, and calculate an optimum.
	But the good thing is, if you use the Functional Data Explorer, you get all this information in one
	analysis. And, as I mentioned before, you're not talking in numbers; you talk in their language of spectral data, and you don't have to translate these key parameters into their world of spectra anymore.
	Yeah, and so you get much more information in the Functional Data Explorer, and maybe they get a deeper insight into the underlying processes.
	Yeah, now I come to the last and final example. It's the
	polymerization process which I was talking about, and that's basically to determine the molecular weight distribution
	which results from a polymerization reaction. And again, as I mentioned, it is typically tricky
	to reduce a whole distribution into certain key parameters. So what does a typical spectrum look like? This is a typical spectrum showing the molar mass of a polymer, so basically the number of repeating units which you have in a polymer chain.
	And that's what you get from the chemical reaction and from the measurement. The key parameters here are well known:
	the number average molar mass, the mass average molar mass, the Z-average molar mass and the polydispersity,
	which is the mass average molar mass divided by the number average molar mass.
	That parameter, the PDI, the polydispersity, gives you an indication of the broadness, the width, of your molecular weight distribution.
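The four key parameters named above follow the standard moment definitions: Mn = sum(N*M) / sum(N), Mw = sum(N*M^2) / sum(N*M), Mz = sum(N*M^3) / sum(N*M^2), and PDI = Mw / Mn. A short Python sketch, assuming the distribution is available as chain counts per molar mass; the function name and the two-species example are hypothetical.

```python
def molar_mass_averages(molar_masses, chain_counts):
    """Compute Mn, Mw, Mz and the polydispersity (PDI = Mw / Mn) from a
    discrete distribution given as chain counts N_i at molar masses M_i."""
    pairs = list(zip(molar_masses, chain_counts))
    moment0 = sum(n for _, n in pairs)           # sum of N_i
    moment1 = sum(n * m for m, n in pairs)       # sum of N_i * M_i
    moment2 = sum(n * m ** 2 for m, n in pairs)  # sum of N_i * M_i^2
    moment3 = sum(n * m ** 3 for m, n in pairs)  # sum of N_i * M_i^3
    mn = moment1 / moment0
    mw = moment2 / moment1
    mz = moment3 / moment2
    return {"Mn": mn, "Mw": mw, "Mz": mz, "PDI": mw / mn}


# Hypothetical sample: equal numbers of chains at 100,000 and 200,000 g/mol.
averages = molar_mass_averages([100_000, 200_000], [1, 1])
print(averages)
```

All four numbers are ratios of moments of the distribution, so they summarize its location and width but not its detailed shape.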
	Again, you have these four key parameters which you calculate from your raw data, your spectral data; again, we already have that data available in order to
	calculate the key parameters. You can do the analysis again using standard techniques, nothing fancy.
	And then you get a model of, in this case, four influencing parameters on, in this case, three key parameters; I didn't put the polydispersity in.
	And what you see is that the number average molar mass doesn't really change a lot if you change these four factors, so the influence of those four factors on the number average molar mass is not too big. It's quite big and quite significant
	on the mass average molar mass, as you can see here. You can move it around quite a lot, so the difference is quite big: about 150,000 ? per mole in total between the lowest setting and the highest setting.
	But this number changing quite significantly while this one doesn't also means that your dispersity should change quite a lot.
	The problem is that what you don't see from these numbers is what the distribution looks like. They don't give you the shape, and in our case the shape is very important because it
	affects the processability of a polymer quite dramatically. So
	having only three numbers, and maybe the polydispersity, doesn't tell you
	if the distribution is monomodal, bimodal, trimodal or whatsoever.
	The dispersity maybe gives you an indication of the broadness of the distribution, but it doesn't tell you if you have one peak, two peaks or three peaks.
	But that makes quite a large difference in the final material characteristics which we get
	from the polymer, and it changes the performance of our adhesive tape in development.
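To make the distinction between one peak, a shoulder, and two peaks concrete, here is a small Python sketch that counts strict local maxima in a sampled distribution. The function name and the three example curves are hypothetical, and real measured data would normally need smoothing before such a count is meaningful.

```python
def count_peaks(signal, min_height=0.0):
    """Count strict local maxima that reach at least min_height."""
    return sum(
        1
        for left, mid, right in zip(signal, signal[1:], signal[2:])
        if mid > left and mid > right and mid >= min_height
    )


# Hypothetical molar mass distributions sampled on a common grid.
monomodal = [0, 1, 3, 6, 8, 6, 3, 1, 0]
shoulder = [0, 1, 3, 6, 8, 6, 5, 5, 3, 1, 0]    # plateau on the flank, no second maximum
bimodal = [0, 1, 3, 6, 8, 6, 4, 5, 7, 4, 1, 0]  # a genuine second maximum

print(count_peaks(monomodal), count_peaks(shoulder), count_peaks(bimodal))  # 1 1 2
```

A width measure such as the polydispersity does not carry this information: curves with a shoulder and curves with a true second peak can both simply look "broad" in a single number.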
	Now, again, since you already have the raw data available, you can use the Functional Data Explorer to get this insight, to get much more information out of these data than just from those key parameters.
	Once again, it takes a couple of seconds, but then you're done.
	And, once again, there is nothing fancy about the analysis. Operating the Functional Data Explorer is also nothing fancy, nothing
	out of the ordinary: you put in your output as Y, you put in your sample ID, then you put in your factors from your DOE.
	Put them in, press the start button and basically that's it. And now you can fit it with the model parameters.
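One plausible way to arrange the data for such an analysis is a stacked (long) table with one row per measured point, holding the sample ID, the x position, the measured y value, and the DOE factor settings repeated for every point of that sample. The Python sketch below builds that layout; the column names, factor names, and values are hypothetical, and whether this matches the speaker's actual data table is an assumption.

```python
def stack_spectra(samples):
    """Turn one record per sample into one row per measured point:
    sample_id, x, y, plus the DOE factor settings of that sample."""
    rows = []
    for sample in samples:
        for x, y in zip(sample["x"], sample["y"]):
            row = {"sample_id": sample["sample_id"], "x": x, "y": y}
            row.update(sample["factors"])  # factors are constant within a sample
            rows.append(row)
    return rows


# Hypothetical DOE runs, each with a short measured curve.
samples = [
    {"sample_id": "run_01", "factors": {"temperature": 60, "initiator": 0.5},
     "x": [1, 2, 3], "y": [0.1, 0.8, 0.2]},
    {"sample_id": "run_02", "factors": {"temperature": 80, "initiator": 1.0},
     "x": [1, 2, 3], "y": [0.3, 0.6, 0.4]},
]

stacked = stack_spectra(samples)
for row in stacked:
    print(row)
```

In this layout the measured curve plays the role of the response, the sample ID groups points into curves, and the factors are what the curve shape is later related to.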
	But it's a very simple, very straightforward procedure and, again, you get much more insight into the underlying parts of interest than by just reducing all those data to some key parameters. In that case,
	I have shown you that one key parameter doesn't really change a lot and the other one does, but that doesn't give you the form; it doesn't give you an indication of or a feeling for the distribution.
	Using this tool gives you that insight, because before you only saw that one number changes a lot and the other one didn't.
	Now you see what actually changes, for example, in that case:
	that peak is more like a shoulder, it's not really a peak, but if you change your influencing factors you come to a point where it actually becomes a peak.
	Yeah, so it really becomes a second peak, and that tells you something about the underlying polymerization mechanism
	which you don't get from the key parameters, from just looking at those three numbers.
	But here, you might trigger something in the developer, because that's his area of expertise, something which helps him understand the underlying mechanism and then
	helps him to plan the next experiments even better than before, when he had just seen these numbers.
	Right, so I'm at the end of presenting my examples. I hope I could show you that the Functional Data Explorer really enables the visualization of
	results to developers in their area of expertise. Keep in mind that, let's say in our case, most developers are non-data scientists, so
	they
	are maybe not so interested in how you get to the results; they're more interested in the results. The
	Functional Data Explorer allows you to talk in their language, in their area of expertise, which in our case, being a chemical company, means spectral data, and you don't have to
	throw away valuable data in order to reduce spectral data into some key parameters and do the analysis on key parameters.
	You don't have to explain how key parameters affect the form and shape of the spectra; you can just use the data and then
	show them the influence of the factors you are studying on the form and shape of the spectra, and how the spectra change.
	In some cases you might get the same insight, but in a lot of cases the analysis using the FDE gives you a deeper insight into the underlying mechanism.
	And you don't have to, let's say, throw away a lot of valuable data in the process of reducing the spectral data into some key parameters.
	With that, I'd like to thank my colleagues from the R&D department for doing all the experimental work, so that I can stand here in front of you and present the nice results. And
	I'm finished with my presentation. Thank you very much for your attention. If you have any questions, please feel free to ask. Thank you very much.

benfrancis · ‎03-23-2021

I really enjoyed your perspective on how to collaborate through common language @shs, thank you for the presentation. Does this effective communication make your colleagues interested in doing some Data Science themselves?

shs · ‎03-24-2021

@benfrancis: Thank you. Yes, it does to some extent. And it definitely generates interest in the overall topic of data science.

benfrancis · ‎03-24-2021

Sounds brilliant @shs, hopefully there will be time when we can talk further about this! I keep my fingers crossed for Discovery Europe 2022!

martindemel · ‎03-26-2021

I agree with Ben's comment. Putting the recipient's language in focus rather than the data science point of view adds such big value! Thanks for this nice presentation with easy-to-understand but at the same time quite impactful examples.