We are pleased to present how JMP plays a major part in our data environment at Soitec. With more than 600 users at our company, it is always a challenge to enhance the user experience and industrialize our products deployed on JMP.
Today, JMP allows many data sources to be targeted but collecting data from AWS services can still be overly complex.
Starting with JMP 15 and thanks to Python integration, we were able to extend the capabilities of JMP further. We have internally validated a set of custom libraries (including S3, Athena, and RDS) that allows us to target the AWS services needed by end users. Today, with JMP 18's embedded Python integration, it is much easier to manage the architecture.
In this presentation, we explain a bit more about the role that JMP plays in our data engineering pipeline and how we can benefit even more from the Python integration with JMP 18.

Hello everyone. Today we are very pleased to be here presenting our poster, which covers JMP integration with AWS services. I am Yacine Belamiri, and I work as a Data Engineer at Soitec.
I'm Guillaume Bugnon. I also work as a Data Engineer at Soitec. Today we're going to introduce the poster. JMP allows many data sources to be targeted natively, with formats such as Excel, JSON, and even PDF files. JMP can also connect to many databases using native ODBC connections, but this is sometimes not enough. Why? Because today the cloud and cloud services like Azure, AWS, or Google Cloud are used more and more in our industry, and in all industries, and are now part of the company data architecture. In some cases JMP users need to access this cloud data, and in some cases we also have to develop JSL-specific connectors.
What are the objectives of this poster? The first thing is to tell you that, starting with JMP 15, we used Python integration to extend the capabilities of JMP by deploying both JMP and Python on users' PCs. What has this allowed us to do? It allowed us to create JMP libraries, which are basically sets of JSL functions dedicated to AWS services or to other cases that are not handled natively within JMP. Just so you know, we have more than 600 users, and we need to manage such deployments so that this connection is easy for everyone to use and all of our developers use the same code.
Starting with JMP 18, JMP makes it much easier to manage the Python integration and removes the dependencies we had before, making it quicker and easier to install JMP and use Python natively.
Why do we do this? Because it enhances the user experience and simplifies access to the data. We can prepare the data up front, store it, and validate it at the corporate level. Another thing is that today at Soitec we have validated JMP as a major step in our data engineering pipeline, so when we need to expose data for statistical purposes, we do it with JMP.
How have we done this? Here is a quick diagram of what we do. In the first place, we created common libraries in Python. These libraries connect, for example, to AWS, or provide statistical features that are not included within JMP, or graphs that we want to include because we find them more appealing. This is the first step.
Then we call these Python libraries from our JSL libraries. These are the JSL libraries that we want our JSL developers to use afterward. They are validated internally by us in IT, so we all use the same code. Then, at the last level, Soitec certifies these JSL libraries, and they give the user a really simple way to connect to the data.
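To make the idea concrete, here is a minimal sketch of what such a common Python layer could look like. The module name soitec_aws and its function names are hypothetical illustrations, not our actual internal code; they simply wrap boto3 so that every JSL library calls the same validated code path.

```python
# soitec_aws.py -- hypothetical sketch of a shared Python layer wrapping boto3.
import io

import boto3
import pandas as pd


def list_s3_files(bucket: str, prefix: str = "", region: str = "eu-west-1") -> list:
    """Return the object keys found under a prefix in an S3 bucket."""
    s3 = boto3.client("s3", region_name=region)
    response = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    return [obj["Key"] for obj in response.get("Contents", [])]


def read_parquet_from_s3(bucket: str, key: str, region: str = "eu-west-1") -> pd.DataFrame:
    """Download a Parquet object and return it as a pandas DataFrame (needs pyarrow or fastparquet)."""
    s3 = boto3.client("s3", region_name=region)
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    return pd.read_parquet(io.BytesIO(body))
```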
If we go into a bit more detail with a simple diagram, we have this kind of architecture, with the three different bricks that Yacine just talked about. Let's say I'm a JSL developer and I want to connect to and extract data stored in S3 buckets on AWS. What I have to do first is include the Python JSL library from its library path, and then I just have to use my function. You can see that it's very easy: with four lines of code, I'm now able to access files stored on AWS S3. But to do this there is some cost at the beginning; we still have to develop the two other bricks, the red one and the black one.
First, this JSL brick calls the Python library, and here we just import the Python library. So these are the bricks: we have the green, the black, and the blue bricks. Let's have a demo. Let's go into JMP. It's very simple. Let's open the log. I just have to import my Python JSL library. You can see that in this Python JSL library we have different functions, two of them here.
Then I can just say that I want to open a Parquet file. Parquet is a file format that is not supported natively by JMP, but with Python I can open Parquet files, so that's okay. I just have to say that I want to open the Parquet file, giving the bucket, a sub-bucket, the file name, and the AWS region, and then I run it. It makes a new data view, it's easy and quick, and I'm able to get my data, so I can then use JMP to do my analysis.
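Under the hood, the JSL function simply forwards these arguments to the Python layer. As a rough equivalent in plain Python, using the hypothetical helper sketched above (the bucket, key, and region below are placeholders, not our real ones):

```python
from soitec_aws import read_parquet_from_s3  # hypothetical helper sketched above

# Placeholder S3 location -- substitute your own bucket, key, and region.
df = read_parquet_from_s3(
    bucket="my-discovery-bucket",
    key="demo/raw_measurements.parquet",
    region="eu-west-1",
)
print(df.head())  # the JSL wrapper turns this DataFrame into a new JMP data table
```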
Another example does the same thing, but here we use the same library to get a secret. A secret could be a password or a credential used to connect to a database, for example. With this function, which is in the library too, it's really easy to get the secret: I give the secret name and the region, and then I can print the secret, and we can see here the secret values.
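For reference, such a secret lookup could sit in the same hypothetical soitec_aws module; here is a minimal sketch with boto3 and AWS Secrets Manager (the secret name below is a placeholder):

```python
import json

import boto3


def get_secret(secret_name: str, region: str = "eu-west-1") -> dict:
    """Fetch a secret from AWS Secrets Manager and parse its JSON payload."""
    client = boto3.client("secretsmanager", region_name=region)
    response = client.get_secret_value(SecretId=secret_name)
    return json.loads(response["SecretString"])


# Placeholder secret name -- in the demo this would be the dev secret for the application.
credentials = get_secret("dev/jmp-discovery")
print(sorted(credentials.keys()))  # avoid printing the secret values themselves in shared logs
```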
Another, more real-world use case we have at Soitec: we have an application that connects to an AWS RDS database to get aggregated data. The application is also able to connect to AWS S3 to get raw data stored in Parquet format.
Thanks to the development of the Python and JSL libraries, we can easily connect to the database and the S3 buckets, and then we just use JMP to do all our statistical calculations. These could be SPC charts or capability indicators such as Cpk, and so on. We get the aggregated data and the raw data, and afterward the user is able to do some data mining and data analysis.
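As an illustration of how the pieces could fit together in such an application, here is a sketch that assumes a PostgreSQL RDS instance and reuses the hypothetical helpers above; the engine type, table names, and S3 paths are all assumptions for the example, not our production setup:

```python
import pandas as pd
from sqlalchemy import create_engine

from soitec_aws import get_secret, read_parquet_from_s3  # hypothetical helpers sketched above

# Aggregated data from RDS (assuming a PostgreSQL instance; all names are placeholders).
creds = get_secret("dev/jmp-discovery")
engine = create_engine(
    f"postgresql+psycopg2://{creds['username']}:{creds['password']}"
    f"@{creds['host']}:{creds['port']}/{creds['dbname']}"
)
aggregated = pd.read_sql("SELECT * FROM spc.aggregated_measurements", engine)

# Raw data from S3, stored as Parquet.
raw = read_parquet_from_s3("my-discovery-bucket", "raw/lot_measurements.parquet")

# Both tables are then handed back to JMP for SPC charts, Cpk, and further analysis.
```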
We have presented what we do to get to our data on S3. To conclude: today, integrating Python into JMP for AWS access offers numerous benefits. It enhances both functionality and efficiency. Python libraries make it easier to interact with anything; in our case we showed AWS, but it can be any other service that is not native to JMP. We did it with S3, DynamoDB, and Secrets Manager, which are the AWS services we use today at Soitec.
The integration we have starting with JMP 18 reduces complexity and saves us time, because before, we had to deploy the Python installation and the Python libraries ourselves. Today the embedded Python enables us to do everything, ensuring that we can access all our data, and it makes it easy for our end users to focus on the analysis rather than the data collection. Together, Python and JMP create a powerful synergy for efficient, data-driven decision-making.
In upcoming JMP releases, we have the introduction of the jmputils Python library. This will provide a really new way to interact between JMP and Python, and we are very thrilled to see how we are going to use it and how it can offer us an entirely enhanced experience, running everything in Python and preparing it for JMP afterward. To be honest with you, we are eagerly waiting for this release.
Thank you. That was all from us. You can see a brief presentation of our company, what we do, and the core of our business.
Thank you.
Thank you.