In our quest to harness Python's extensive capabilities within JMP, we faced the challenge of integrating Python's flexibility with JMP's intuitive and interactive user experience. Prior to JMP 18, we achieved this through a workaround: running Python on a server to bypass JMP's cumbersome Python integration. However, the release of JMP 18 reintroduced Python integration, simplifying this process.

Building on our initial concept of enabling JMP users to leverage Python without requiring programming expertise, we developed JMPyFacade (JMPy). This tool offers a familiar, JMP-like interface for executing predefined Python services via dynamically generated user interfaces, all while abstracting away the underlying Python and JSL code.

In this presentation, we explore the technical architecture of JMPyFacade and demonstrate how it effectively bridges the gap between JMP and Python. Attendees will learn how Python can be leveraged within JMP for efficient and engaging data analysis.


Hello, and welcome to my poster presentation on Bridging JMP and Python for seamless and engaging analysis. My name is Jarmo Hirvonen, and I work for Murata in Finland. In this poster, I will explain why you should utilize Python in JMP through the Python integration, some things to note regarding the integration, and one of the applications my colleague Eero and I have developed here at Murata Finland that utilizes the integration. Let's start with why you should utilize this capability in JMP.

JMP is a very visual and interactive tool with good analytical capabilities. Python, on the other hand, is basically the de facto programming language for many data-related tasks. It has many well-established libraries, and your company may even have its own capabilities in Python.

I see this from two points of view. JMP-first users get access to features that are not natively available in JMP, such as some machine learning models (Torch is one example), and access to file types you couldn't otherwise use in JMP.

One example of that is Parquet files: you can open them in Python and then bring them through the integration to JMP. Python-first users, meanwhile, get access to JMP's visual and interactive tools, which they can utilize either directly in JMP or in some integrated way from the Python side. Of course, there are also some analytical capabilities that you can only find in JMP. As for things to note regarding this integration: Python is installed with JMP 18, for one.
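As a minimal sketch of that Parquet use case, assuming pandas with a Parquet engine such as pyarrow is available in JMP's embedded Python environment (the file path and cleanup step here are hypothetical):

```python
import pandas as pd

# Read a Parquet file that JMP cannot open natively.
# Requires a Parquet engine such as pyarrow or fastparquet.
df = pd.read_parquet("measurements.parquet")  # hypothetical path

# Optional cleanup before handing the frame over to JMP through
# the integration; the JSL side can then fetch this variable
# as a data table.
df = df.dropna(axis=1, how="all")
print(df.dtypes)
```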

It is a bit slow on the initial load per JMP instance. Meaning, when Python is first loaded in a JMP instance, it takes some time, but on the next load, if you don't close JMP fully, it is faster. You might have to take this into account if you are just doing something very small; it might be worth trying to do it in plain JSL, which is faster.

Package management and installation can sometimes cause problems. There is some documentation regarding it, but it can still be a bit of a hassle to solve. You have only one environment, so if you are used to working with virtual environments, you cannot currently do that in the JMP integration. You also have to understand, on some level, the JSL and Python data type conversions, but there is documentation on JMP's web pages regarding this. Scoping can be confusing, or rather, it is a bit confusing: Python basically runs in a global scope, and depending on how you name your variables, they can get tangled between different scripts.

Then finally, in my opinion, maybe one of the biggest things here is that Python debugging and development can be an annoying experience, and it can be difficult to get into a flow. This is mostly because you cannot reset Python in JMP or restart the kernel, so you end up in this kind of "cycle of development": you develop in Python, then you want to test it, and you suspect there is something you would want to clean up, some variables, for example.

The only way to do that currently is to close the JMP instance, restart JMP, load everything back, and then test the code. So it is not really a cycle; it basically breaks here, and you have to restart the whole thing from the start. It breaks the flow, and that causes frustration. You can mitigate this a little bit depending on how you write your Python code, as the sketch below suggests.
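One such mitigation, hedged as a general Python pattern rather than anything JMP-specific: keep service code inside functions so temporaries stay local instead of piling up in the shared global scope, and delete any throwaway globals explicitly.

```python
# Keep work inside functions so temporaries die with the call,
# instead of lingering in the shared global namespace.
def run_service(df, params):
    intermediate = df.dropna()   # local, freed when the call returns
    return intermediate.describe()

# If something must live at module level, namespace it clearly
# and clean it up explicitly when you are done with it.
_jmpy_cache = {}
_jmpy_cache["last_result"] = run_service
del _jmpy_cache["last_result"]
```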

Then on to our application: what is JMPyFacade? First, the name comes from the way it basically facades the complicated JMP and Python code behind a fairly familiar, JMP-like user interface. It is a version-controlled add-in, which gives JMP users access to Python capabilities that have been predetermined. We call these services. They range from very simple dataset creations to much more complicated machine learning models.

What we want to do with this is give our JMP users access to Python capabilities without them basically knowing that they are using Python at all. We did have something a bit similar earlier, but in that case we utilized a server to bypass JMP's Python integration, because it was a bit difficult to use back then. Now, with JMP 18, we can use the native integration, and it works pretty well.

Then, the flow of the application: when the user runs the add-in, it first checks whether it is up to date. If not, it gets updated and the user reruns the add-in. If there are no updates to be found, it initializes Python in JMP. The purpose of this step is basically to create the required folders; an application like this could run into problems if the user hasn't run it before and they do not exist.
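A minimal sketch of that initialization step; the folder names and location below are hypothetical placeholders:

```python
from pathlib import Path

# Folders the add-in relies on, created idempotently so a
# first-time user does not hit missing-directory errors.
for folder in ("cache", "logs", "models"):  # hypothetical names
    Path.home().joinpath("JMPyFacade", folder).mkdir(
        parents=True, exist_ok=True
    )
```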

After those are initialized, we check from our package management whether there are packages that should be updated. We update them if necessary, and if there is nothing to update, we just continue. We then check whether the user has a table open; if there is no table, we close the add-in and inform the user to open a table before running it. This is done currently because we basically only support a single static table.
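A sketch of what such a package check might look like, using only the standard library's importlib.metadata; the package names and versions are placeholders, and a real check would use packaging.version for comparison:

```python
from importlib import metadata

REQUIRED = {"pandas": "2.0", "scikit-learn": "1.3"}  # placeholders

def outdated_packages():
    """Return packages that are missing or older than required."""
    stale = []
    for name, wanted in REQUIRED.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            stale.append((name, None, wanted))
            continue
        # Naive major.minor comparison; packaging.version.Version
        # would be more robust.
        if tuple(map(int, installed.split(".")[:2])) < tuple(
            map(int, wanted.split("."))
        ):
            stale.append((name, installed, wanted))
    return stale
```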

Then, when the user has a table open that they want to analyze, we import the packages. If there is an issue, it again stops and shows in the log what went wrong. If it is successful, we initialize and create the user interface, and the user can start using the tool. If the user interface already exists in the JMP instance, we just bring it to the front and do not initialize anything.

A little bit on how it works in the background: we are utilizing data types and JMP classes, and on the Python side also the Pydantic package, to create the models we use and for type validation and data validation.
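A minimal sketch of such a service model, with hypothetical field names; Pydantic's Field constraints end up in the generated JSON Schema, which is what drives the UI generation described next:

```python
from pydantic import BaseModel, Field

class HDBSCANParams(BaseModel):
    """Hypothetical parameter model for an HDBSCAN service."""

    # At least one column must be selected; the extra schema hint
    # could tell the JSL side which column types to offer.
    features: list[str] = Field(
        min_length=1,
        json_schema_extra={"columns": "numerical"},
    )
    min_cluster_size: int = Field(default=5, ge=2)
    allow_single_cluster: bool = False

# The JSON Schema the JSL side would consume to build the UI.
print(HDBSCANParams.model_json_schema())
```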

Here we have an example of HDBSCAN. If we take, for example, this features argument here: it is a list of strings, a field which has a minimum length of one, and it utilizes JSON Schema to say these are numerical columns. When JMP gets this, it knows to create this type of element, a column list box, where the user has to input at least a single numeric column for this to work. There are different types of elements for different types of data: integers get number edit boxes, Booleans get checkboxes, and so on.
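A sketch of how the schema could map onto JMP display elements; the mapping below is illustrative, not the add-in's actual code, although the widget names are standard JSL display boxes:

```python
# Illustrative mapping from JSON Schema types to the kind of
# JMP display element the JSL side would build for them.
SCHEMA_TO_WIDGET = {
    "array": "Col List Box",      # column selections
    "integer": "Number Edit Box",
    "number": "Number Edit Box",
    "boolean": "Check Box",
    "string": "Text Edit Box",
}

def widget_for(field_schema: dict) -> str:
    return SCHEMA_TO_WIDGET.get(field_schema.get("type"), "Text Edit Box")
```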

When the user is ready to run the service, they fill in the columns and click the Run Service button. At this point, the JMP side of the code creates a subset of the data consisting of these columns, taking the ones that are not excluded. This subset is then used when the Python function is called. It makes things a bit easier to manage on the Python side; that's why we do it like this.

Python runs its algorithm and updates the values in the subset. When Python returns, JMP knows it can update the original table: it updates the table, closes the subset, and also creates some table scripts and column properties in the original table. Basically, every time the user runs a service, it creates new columns in the table, if there are columns to be created.
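The service contract could be sketched like this, as an assumption about the shape rather than the actual code: a function that receives the subset as a pandas DataFrame and returns the new columns for JSL to write back.

```python
import pandas as pd
from sklearn.cluster import HDBSCAN  # scikit-learn >= 1.3

def hdbscan_service(
    subset: pd.DataFrame, min_cluster_size: int = 5
) -> pd.DataFrame:
    """Hypothetical service body: subset in, new columns out."""
    labels = HDBSCAN(min_cluster_size=min_cluster_size).fit_predict(
        subset.to_numpy()
    )
    # The returned columns are appended to the original table by JSL.
    return pd.DataFrame({"HDBSCAN Cluster": labels}, index=subset.index)
```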

Then a few examples of the usage. This is a basic example: the user runs the add-in, it checks against version control, and in this case the user wants to run t-SNE, so they select t-SNE from the left side and input the columns. They can also check the documentation, which links to the official t-SNE documentation, if they want to know, for example, how it works, what it does, and what the parameters are. When the user is ready to run the service, they click the Run Service button, new columns are created in the JMP table, and they can work with those as they normally would in JMP.
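For reference, a minimal sketch of what such a t-SNE service might run underneath, using scikit-learn; the returned column names are hypothetical:

```python
import pandas as pd
from sklearn.manifold import TSNE

def tsne_service(
    subset: pd.DataFrame, perplexity: float = 30.0
) -> pd.DataFrame:
    """Hypothetical t-SNE service: returns two embedding columns."""
    embedding = TSNE(n_components=2, perplexity=perplexity).fit_transform(
        subset.to_numpy()
    )
    return pd.DataFrame(
        {"tSNE 1": embedding[:, 0], "tSNE 2": embedding[:, 1]},
        index=subset.index,
    )
```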

We can also view the report script; in this case it is just the input parameters. Here are some more extensive examples. This demonstration shows an interactive mode, in my opinion one of the best features we have here. These services do not always create new columns; they update static columns, and they immediately rerun their Python algorithms when the user changes any value in the user interface. You can use this, for example, for exploratory data analysis or for fine-tuning algorithms.

Here, the user basically presses the up arrow on the keyboard, and it reruns the Python algorithm every time, so the clustering changes as the user updates the values. Change the values more and the components change, so you get totally different clusters depending on the values you have. You can modify the values here interactively, figure out which ones work best in this case, and then maybe go back to Python and use those values there, for example.
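The interactive mode could be thought of along these lines, as a sketch under the assumption that JSL calls one rerun function on every UI change and overwrites a fixed column; the algorithm and column name are illustrative:

```python
import pandas as pd
from sklearn.cluster import KMeans

def rerun_clustering(subset: pd.DataFrame, n_clusters: int) -> pd.DataFrame:
    """Hypothetical interactive callback, invoked on every UI change."""
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(
        subset.to_numpy()
    )
    # Fixed column name: JSL overwrites this static column in place
    # rather than adding a new column per run.
    return pd.DataFrame({"Cluster (interactive)": labels}, index=subset.index)
```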

Then we have this feature importance service, which utilizes multiple algorithms and uses Shapley values to determine which columns could be the most important for modeling or predicting something. We run the models, we get Shapley values back, and also a ranking from our selected algorithms; in this case, LightGBM, XGBoost, and AdaBoost. From here, a user could basically copy the values and then start modeling in JMP using these columns.
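A hedged sketch of Shapley-based importance for one of the models, using the shap package with XGBoost; aggregating across several models, as the service does, would repeat this per model:

```python
import numpy as np
import pandas as pd
import shap
from xgboost import XGBRegressor

def shap_importance(X: pd.DataFrame, y: pd.Series) -> pd.Series:
    """Rank features by mean absolute SHAP value for one model."""
    model = XGBRegressor(n_estimators=200).fit(X, y)
    shap_values = shap.TreeExplainer(model).shap_values(X)
    # Mean |SHAP| per feature, largest first.
    return pd.Series(
        np.abs(shap_values).mean(axis=0), index=X.columns
    ).sort_values(ascending=False)
```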

Then finally, some future work for this tool: adding support for dynamic tables (right now we have the one static table); more services, such as image analysis and our own custom machine learning models; improved interactivity, so Python would basically create or update graph scripts directly; exporting and importing Python models; utilizing JMP's great DOE capabilities for parameter fine-tuning; creating pipelines, so you could run multiple services in sequence; improving the dynamic features we have; and then, depending on what JMP 19 brings, some possible refactoring so we can further improve the tool. One thing to discuss, maybe, is open-sourcing this, so there could be more than just us developing the tool. Thank you.




