We provide some friendly banter and debate on the role that JMP and JMP Pro are likely to play amidst modern artificial intelligence (AI) and machine learning (ML) trends. In the evolving landscape of AI/ML, scientists and engineers across industries need intuitive tools for data exploration, model building, and deployment.
JMP and JMP Pro, as no-code platforms, simplify and speed up the process from raw data to AI with visual, interactive capabilities that enhance understanding and efficiency. This presentation highlights their strengths through real-world examples, including deep learning for tabular, text, and image data, showcasing how these tools enable seamless, AI-driven insights with minimal coding expertise required and potentially significant savings in time and energy.
Hi. Welcome to the talk. It's Florian Vögl speaking here, together with my colleague, Russ. We're dialing in together from Germany and the US to talk about whether JMP can really play a significant role in the future of AI and machine learning. Right?
Good question, Florian.
Obviously, all of the answers will be biased today, but I think we have a few good points to make, and particularly, I think we have some exciting examples to share with you. We think it's particularly powerful. JMP has always been designed to be helpful for scientists and engineers and has evolved to incorporate an end-to-end toolset, from raw data all the way to what we call shareable insights, and I think that includes everything you need to play a significant role in the AI and machine learning landscape.
You know, Florian, our customers, scientists and engineers, are very busy individuals, and with a lot of modern AI/ML there's the impression out there that the only way to get to it is with coding. We're going to try to break that today and show how, through JMP Pro, we can actually access some very powerful methods, some of them basically cutting edge, right through an interface, and get the answers much more quickly than having to write one-off code all the time.
Yeah. It's one of the powerful aspects, and all that in combination with all the powerful tools for scientists and engineers that JMP already incorporates. Enough talking. Let's go by example. Russ, I think you have a great example prepared for us to start off.
Yes. I wanted to highlight that there's a big wave now in the chemistry world around predicting [inaudible 00:02:06] basically just from a structural description, often a SMILES string, and I wanted to highlight a really nice example that just came out a few months ago, where they're studying the effects of ice recrystallization, trying to see these effects of freezing on molecules. The research was done a few years ago, it got through all the publication hurdles, and it's now a nice paper in Nature Communications. They've also made the data nicely publicly available in a GitHub repo, which is where I got the data.
We won't go into the paper in too much depth, but I thought they did a very thorough job. They basically focus on a classic (Q)SAR approach where they assembled several different descriptor sets of the molecules. They tended to use leave-one-out cross-validation and then presented results. For example, this first data set is called Glyco2, and the final columns of this table show the performance metrics, mean squared error and Pearson correlation, for a couple of different feature sets. They also do a nice thing where they ensemble, and you can see the improvement from ensembling takes the correlation from around 0.4 or so all the way up to above 0.45, 0.46.
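As an aside for readers who want to check such numbers themselves: the two metrics in that table are simple to compute. Here is a minimal pure-Python sketch (illustrative only; it is not the authors' code, which lives in their GitHub repo):

```python
import math

def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def pearson(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(y_true)
    mt = sum(y_true) / n
    mp = sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    st = math.sqrt(sum((t - mt) ** 2 for t in y_true))
    sp = math.sqrt(sum((p - mp) ** 2 for p in y_pred))
    return cov / (st * sp)
```

In practice you would use a library implementation, but the definitions above are all that the table's columns contain.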
This is a dataset I've been reanalyzing in JMP with the JMP Pro Torch Deep Learning Add-In, and here's the data; in this case we've got 223 molecules. The first thing I did was run two nice add-ins that are relatively new. The first one is called the Materials Informatics Toolkit, where you can just put in SMILES strings and create the pictures that you see here. You can also create RDKit-based features, and then the authors also provided their own feature set.
This already is a nice, rich data set for classic (Q)SAR. We don't have time to go into too much depth, Florian, but it turns out there are now several competing approaches for how we might predict this outcome. The classic (Q)SAR would just take the numeric features as usual and input them into, say, a neural network, or you can even use XGBoost or other tree-based methods. Just input these and go, and you do want to set up some kind of cross-validation like they did. We tend to prefer something like fivefold as opposed to leave-one-out, but I don't think it makes too much difference in the end.
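For readers following along outside JMP, the fold assignment Russ describes can be sketched in a few lines of plain Python. This is a generic sketch, not JMP's internal validation logic:

```python
import random

def kfold_indices(n, k=5, seed=42):
    """Shuffle the n row indices and split them into k roughly equal,
    non-overlapping folds; with k == n this reduces to leave-one-out."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]
```

For the 223 molecules here, `kfold_indices(223, 5)` yields five folds that together cover each molecule exactly once; each fold is held out in turn while the model trains on the rest.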
But now, the really interesting thing is we can do these more modern machine learning approaches, and today I wanted to mention two of them. The first is a graph neural network that was popularized by a package called Chemprop, which builds on RDKit features. The idea is you build a graph neural network that mimics the molecule and then predict the outcome once you set it up, and we've got that now in the Torch Deep Learning Add-in. The input is just a SMILES string, and then you give it the output, and that's all you need. It's a pretty easy setup, although sometimes tuning the model can take a little time, and we offer quite a few options for that, which we won't go into today.
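To give a feel for what a graph neural network of this kind does under the hood, here is a toy sketch (this is not Chemprop, just an illustration of the message-passing idea): each atom carries a feature vector, each round updates it with an aggregate of its bonded neighbors' vectors, and a final readout over all atoms produces a molecule-level vector for a predictor.

```python
def message_pass(features, adjacency, rounds=2):
    """Toy message passing over a molecular graph.
    features:  list of per-node feature vectors.
    adjacency: adjacency[i] lists the neighbors of node i.
    Each round adds the sum of neighbor vectors to each node's vector;
    the readout sums the node vectors into one molecule-level vector."""
    h = [list(f) for f in features]
    for _ in range(rounds):
        new_h = []
        for i, hi in enumerate(h):
            agg = [sum(h[j][d] for j in adjacency[i]) for d in range(len(hi))]
            new_h.append([a + b for a, b in zip(hi, agg)])
        h = new_h
    # Readout: sum over all nodes, dimension by dimension.
    return [sum(h[i][d] for i in range(len(h))) for d in range(len(h[0]))]
```

Real implementations learn the update and readout functions rather than using fixed sums, but the structure, neighbor aggregation repeated for several rounds and then a readout, is the same.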
The latest thing we've got, which is brand new and which I wanted to mention, is a model called a transformer CNN, which takes arguably the two most popular neural network architecture components, transformers and convolutional networks, and puts them together into one model.
Here, we're doing the transformer piece using a classic BERT model, again with SMILES string input, and then we pipe the output of that into a series of 1D convolutional layers. The bottom line is we're able to improve on the results of the paper by a little bit. I won't go into all the details, but the paper mainly focused on classic (Q)SAR, and by combining with these modern methods we can improve the results even further, all possible now in the Torch add-in, basically just through mouse clicks. This is an example output that you might get from one of the models, and then the ensembling and everything else is fairly easy as well. I'm really excited about this particular application, Florian. I hope we've already made our case with just this one example of using SMILES strings.
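The ensembling step mentioned here is, in its simplest common form, just averaging the per-row predictions of the individual models. The paper's exact ensembling scheme may differ; this is a generic sketch:

```python
def ensemble(*model_predictions):
    """Average several models' predictions row by row.
    Each argument is one model's list of predictions for the same rows."""
    return [sum(vals) / len(vals) for vals in zip(*model_predictions)]
```

Averaging tends to cancel out the individual models' uncorrelated errors, which is why the ensembled correlations in the paper's table beat any single feature set.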
I think absolutely. At least for me, as a biotechnologist who never learned coding from scratch at university, this all sounds like something you could do, but that would be very tough. Just to add to what you showed: these are, let's say, industry-standard packages built in Python, methods that are already acknowledged and used, but they're now accessible through this nice interface where you don't have to do any coding at all, and you can still work with the output just as you could if you were coding. That's a nice thing, I think, particularly for us.
Going further, I think JMP actually adds a little bit of magic to the process because of our philosophy of always having tight integration of the modeling with the graphics. Then we've got things like the profiler, which I think you're about to show us, so we can dive deeper, not just fitting a model, but actually looking at it and then making predictions or running what-if scenarios. This is exactly the kind of thing that scientists and engineers struggle with all the time.
It's a very good point. Let me just quickly share my screen and my view on this particular aspect of why I think JMP plays an important role in the future. On the one hand, these advanced techniques seem to be a must-have, a must-do exercise; on the other hand, it matters to have them accessible, and especially, as you say, scientists and engineers are not only interested in a pure prediction, they always seek to understand why things are happening.
Especially in research and development, but also in production settings, it's important to get that understanding of how things interact, and one of the really cool things in JMP is specialized techniques like the one we see here, the Functional Data Explorer, which allows us to take a look at a complex output like a curve. In this case, we have a chromatogram, and the particular question here is how we can get as much of the green stuff, that's product, without getting any of the red stuff.
This platform allows us to make predictions, but it also allows us to explore the settings, the things that we can tweak in our process, and see how much or how little each of them relates to the desired outcome. We can also combine that with, let's say, more black-box kinds of modeling to get a prediction that tells us, "Okay, just run it like this and then you'll get a good outcome."
I think this is one corner of the toolset, a little bit on one end, but JMP also offers very powerful general capabilities, as most people who have used JMP know. If we take a look at a data table like this one, where we have a lot of fermentation batches, we just saw that we can look at these curves, investigate them, and see how their shape relates to input factors. But we can also combine all that information into one batch fingerprint, and that's just as easy in JMP.
I have this little button here. I need to rearrange myself. This is the table I was looking for. All you need to do is apply some small transformations to your data table, which is easily possible in JMP, and as you saw, you can also automate all of that. From there you can use Graph Builder and just create your fingerprint. I think that was not accurate; and then you can color it and make it a heat map.
Now you have a fingerprint which at first glance is an average of all batches, but you can also create one for every batch, and instantly you have a fingerprint specifically highlighting the values in that image for each of the batches. It's down to the subject-matter expert to decide what you want to have in there, but it is now easily accessible, not only for you to get an overview; you can also easily put it back into a table and join it into your original data.
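Conceptually, the fingerprint boils down to scaling each measured variable onto a common range so the rows can be colored side by side as one image. A hypothetical sketch (the variable names and the min-max scaling choice are my own illustration, not JMP's internals):

```python
def fingerprint(batch, lo, hi):
    """Scale each measured variable of one batch to [0, 1] so the rows
    can be rendered together as a heat-map image.
    batch:  dict mapping variable name -> list of values over time.
    lo, hi: dicts giving each variable's scaling range, e.g. the min
            and max observed across all batches (hypothetical choice)."""
    return {
        var: [(v - lo[var]) / (hi[var] - lo[var]) for v in values]
        for var, values in batch.items()
    }
```

Computing one such matrix per batch and rendering each as a small heat map gives exactly the per-batch images described above, ready to be joined back into the table.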
You can also use it for, let's say, high-end machine learning image classification, where you can predict a yield or some other outcome of interest, and I think that makes it a very powerful toolset. You can see how quickly and easily you can get from data transformation and visualization to creating image-based predictors and putting them to use. I think, Russ, you have another example of that.
This is incredible, Florian, because all these new possibilities are just coming out. JMP was already very powerful; we've got a lot of people who just love it. Now we can work not only with SMILES string predictions, as we showed, but with pictures as inputs and as data, and that is opening up a whole new world of possibilities. As you've just shown, for example, and this is something you maybe wouldn't ordinarily think about, especially in JMP, you can easily convert multivariate or functional data into an image. Once you do that, we are in the wheelhouse of modern ML methods for image processing, the classic one being image classification, which you can also do in the Torch add-in. I've got a similar example here, analogous to the one you just showed, Florian. In fact, I created these same images with Graph Builder.
By the way, it's also easy to import any kind of pictures directly into JMP using File > Import Multiple Files, whether your image data has already been captured, say from a camera, or you want to make the images yourself by massaging or manipulating numerical features. In the end, a picture is just a very intuitive, compact way of capturing a lot of detailed data. Typically, time might be the X axis, and then you've got some flexibility in how to set up the Ys, though often those have meaning too in terms of their ordering. Then, in terms of the add-in, it's very easy, almost easier than before: all you have to do is take the picture, make it your input, and then specify your target, something that you want to predict.
Let's say in this case it's maybe a quality score. You can also do more than one, plus some kind of validation framework so we don't overfit, and then the add-in presents you with some very convenient options to tap into these classic modern image classification architectures. You can try several different ones and compare methods quickly. We're also trying to code up a lot of extra options to really help you fine-tune your models, including adjusting the input image size and some augmentation, and as you get into it more, you can start to play with things like dropout and other bells and whistles to make a model that's custom-tuned for the particular application at hand. The add-in kind of takes everything into account for you.
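To make "augmentation" concrete: one classic example is a horizontal flip, which mirrors each training image to create an extra example and make the model less sensitive to orientation. A toy sketch on a raw 2D pixel grid (illustrative only; the add-in's actual augmentation options may differ, and flipping is not always appropriate when the X axis encodes time):

```python
def horizontal_flip(image):
    """Mirror a 2D pixel grid left-to-right by reversing each row,
    a classic data-augmentation transform for image classifiers."""
    return [row[::-1] for row in image]
```

Augmentations like this effectively enlarge the training set for free, which matters when, as here, you have only a few hundred examples.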
It'll also take advantage of modern hardware. Typically you want a Windows machine with an NVIDIA GPU, or if you've got a Mac, you typically want one of the modern M chips, because these calculations can get very intense very quickly, especially as the images grow in both number and size. It helps to have powerful computing at hand, which is fairly affordable these days, thankfully, and then you can go to town and make predictions like never before. I think this is a really exciting time for JMP Pro, and we've got a lot of customers who are already very excited.
I know a few have already built models on their images and are deploying them in practice now with Python. To me, we've already proved our point: in terms of whether it can play a role, it already does, and I think we're just getting started. Very exciting times.
Absolutely. Thank you for sharing these great examples. As you say, we have already seen that it plays an important role. Just to summarize again: we've always focused on scientists and engineers, helping them be more efficient with using data in their daily work, and typically these people don't have a specific background in coding or machine learning. On top of that, they're often too busy to learn things like coding machine learning models from scratch.
Yeah, and to answer the question: we have the whole toolset that a lot of scientists and engineers are already used to in JMP, a very intuitive and complete toolset, and on top of that we now have, as we have seen today, these cutting-edge methods available. To summarize it in one word: yes, JMP can play a significant role in the future of AI and ML. I guess you agree, right?
Fully agree. Not only that; I think there's this interesting middle ground now between the coders and the scientists, the gap is non-trivial in many cases, and there's knowledge to be gained on both sides. If we can sit in there as an enabling tool, we can maybe even push the envelope in some cases. I know that in writing the interface we have to wrestle with some questions about what the best defaults are and the best ways to arrange things. I think all of this helps with the complexities involved and helps everyone get to good-quality answers faster.
I can only agree. Thank you very much, Russ. I think that completes our talk for today, but we're always happy to have more discussions around those topics.
Thank you.
Thank you. Bye-bye.
Presenters
Skill level
- Beginner
- Intermediate
- Advanced