Since you are reading this, you likely know that Python is a highly utilized open-source tool deployed by scientists and engineers – some more eagerly than others, I might add – to address various problems.
So far, my conversations with JMP® users suggest that most of them develop Python workflows to address upstream tasks of accessing data sources, rather than using Python for, say, the machine learning tasks with which it is typically associated. Considering the dizzying array of unique file types from an avalanche of different instruments in any given industry, Python reigns as a go-to method for accessing data because – thankfully! – someone, somewhere has likely built a framework to coerce the files into giving up their secrets.
My initial interest in JMP's Python integration was similarly geared toward data access. Prior to joining JMP, I worked in the cardiology department of a local hospital. When JMP first released the Functional Data Explorer (FDE) platform, ECG (aka EKG) data from heartbeats seemed a perfect fit for analysis in the new tool: with no traditional top-down model of their form, a bottom-up approach of capturing their common shape variability across patients (i.e., within each ECG lead, the 'beat' you see) seemed tailor-made for FDE.
One teeny snag: all the examples of ECG data I could find use WFDB (for “Waveform Database”) file standards, which involve binary formats that JMP does not natively open. Thankfully, the refurbished JMP and Python integration now supplements JMP’s menu with access to all kinds of file types, for which I again thank the generous problem solvers who create Python packages fit for the task.
After identifying ECG data and the wfdb Python package from PhysioNet.org, the new tools built into the JMP and Python integration allow me to:
- Pull the ECG data from the WFDB file structures via Python.
- Create JMP data tables and populate them via Python.
- Hand off the remaining efforts to JMP via running a JMP Scripting Language (JSL) file of steps created using JMP Workflow Builder – which captures the code of my JMP point-and-click actions – for data prep, visualization, and FDE analysis.
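The first step in the list above, pulling signals out of a WFDB record, might be sketched roughly as follows once the wfdb package is installed. This is a minimal illustration, not the script used for this post: `signals_to_columns` and `read_ecg_record` are hypothetical helper names, and the record path you pass in depends on where your files live. The wfdb calls used (`rdrecord`, `p_signal`, `sig_name`) are part of the package's documented interface.

```python
def signals_to_columns(samples, lead_names):
    """Turn row-major samples (one row per time step) into {lead: [values]}."""
    columns = {name: [] for name in lead_names}
    for row in samples:
        if len(row) != len(lead_names):
            raise ValueError("each sample row needs one value per lead")
        for name, value in zip(lead_names, row):
            columns[name].append(float(value))
    return columns


def read_ecg_record(record_path):
    """Read one WFDB record; record_path is the file path without extension."""
    import wfdb  # imported lazily so the helper above works without wfdb

    record = wfdb.rdrecord(record_path)
    # p_signal has shape (samples, leads); sig_name holds the lead labels
    return signals_to_columns(record.p_signal.tolist(), record.sig_name)


# Tiny made-up example with two leads and two time steps
demo = signals_to_columns([[0.1, 0.2], [0.3, 0.4]], ["I", "II"])
print(demo)
```

Keeping the reshaping logic separate from the file read makes it easy to check on toy data before pointing it at real records.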
JMP’s “Python-in-a-box” approach includes a built-in Python editor as well as pre-installed packages (jmp and jmputils) with which I can:
- Install other Python packages required for the task using the jmputils jpip command (paralleling Python’s pip).
- Import these packages with ordinary Python import statements.
Installing and importing the wfdb package helped clear the first hurdle of gaining access to the data in JMP.
################################################################
# let's get this party started - load up some Python packages
################################################################
import jmp
import jmputils
jmputils.jpip('install', 'numpy pandas')
jmputils.jpip('install', 'wfdb')
import numpy as np
import pandas as pd
import wfdb
import os
If it looks like Python, that's because it is! All within JMP.
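If you rerun a script often, you may prefer to call jpip only for packages that are not already importable. The sketch below is one way to do that under a couple of assumptions: `missing_packages` is a hypothetical helper, and it only works when a package's install name matches its import name (true for numpy, pandas, and wfdb, but not for every package). The jmputils import is guarded so the snippet also runs outside JMP, where it simply reports what is missing.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot currently be imported."""
    return [n for n in names if importlib.util.find_spec(n) is None]


needed = ["numpy", "pandas", "wfdb"]
to_install = missing_packages(needed)
print("packages to install:", to_install)

try:
    import jmputils  # only available inside JMP's Python environment
except ImportError:
    jmputils = None

if jmputils is not None and to_install:
    # jpip takes the package names as one space-separated string
    jmputils.jpip("install", " ".join(to_install))
```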
From there, in addition to some Python data prep steps, the remaining hurdles were cleared with a few more new JMP and Python integration items:
- Creating the JMP data table using the Python jmp.DataTable() function.
- Adding columns to the JMP data table via new_column() before passing the data list.
# migrate the signal data to a JMP table
# 'ecgall' is the pandas DataFrame containing 12 columns of ECG lead data
ecgdtname = "ECG Data"
ecgdt = jmp.DataTable(ecgdtname, rg*signal.shape[1])  # 2nd argument assigns # of rows
for col in ecgall.columns:
    # creating new column
    ecgdt.new_column(col, jmp.DataType.Numeric)
    # adding data
    ecgdt[col] = ecgall[col].tolist()
I’ve only shown one way to create and populate the new JMP table using Python, but a variety of approaches exist, including validating data types before populating or simply using brute force.
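As a rough illustration of the "validate data types first" approach, the sketch below classifies each column as numeric or character before creating it. The helper names and sample data are hypothetical, and `jmp.DataType.Character` is assumed to be the character counterpart of `jmp.DataType.Numeric`; check the Scripting Index before relying on it. The JMP calls are guarded so the classification logic can be tried outside JMP.

```python
from numbers import Real


def is_numeric_column(values):
    """Return True when every non-missing value is a real number."""
    present = [v for v in values if v is not None]
    # bool is a subclass of int, so exclude it explicitly
    return bool(present) and all(
        isinstance(v, Real) and not isinstance(v, bool) for v in present
    )


def classify_columns(columns):
    """Map each column name to 'numeric' or 'character'."""
    return {
        name: "numeric" if is_numeric_column(values) else "character"
        for name, values in columns.items()
    }


# Hypothetical sample data standing in for ecgall[col].tolist()
sample = {"I": [0.12, 0.15, 0.11], "patient": ["a01", "a01", "a02"]}
kinds = classify_columns(sample)
print(kinds)

try:
    import jmp  # only available inside JMP's Python environment
except ImportError:
    jmp = None

if jmp is not None:
    dt = jmp.DataTable("Validated Demo", 3)
    for name, values in sample.items():
        if kinds[name] == "numeric":
            dt.new_column(name, jmp.DataType.Numeric)
            dt[name] = values
        else:
            # jmp.DataType.Character is an assumption; see the lead-in
            dt.new_column(name, jmp.DataType.Character)
            dt[name] = [str(v) for v in values]
```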
Finally, jmp.run_jsl() provides a handy way to have Python run some JSL for you. To keep my sanity, I’ve kept "the hot side hot and the cold side cold" by putting the JSL workflow, derived from the Workflow Builder recording my actions, in another file entirely and used the JSL include() function to reference and run its contents.
jmp.run_jsl('''
include("PTB-XL Post Python Workflow.jsl");
''')
I could have taken the other route and used Python Submit() or Python Submit File() to run everything from JSL, but that would make explaining things here from the Python side a bit murkier. Check out the Scripting Index in JMP under Help for more info.
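If the workflow file lives somewhere other than the current folder, it can help to build the include() statement from a path rather than typing it by hand. Here is a minimal sketch, assuming JMP accepts forward slashes as path separators on Windows; `build_include_jsl` is a hypothetical helper, and the jmp import is guarded so the string-building part runs anywhere.

```python
def build_include_jsl(script_path):
    """Return a JSL include() statement for the given script path."""
    if '"' in script_path:
        raise ValueError("script path must not contain a double quote")
    # Forward slashes sidestep JSL escape sequences that begin with a backslash
    return 'include("{}");'.format(script_path.replace("\\", "/"))


jsl = build_include_jsl("PTB-XL Post Python Workflow.jsl")
print(jsl)

try:
    import jmp  # only available inside JMP's Python environment
except ImportError:
    jmp = None

if jmp is not None:
    jmp.run_jsl(jsl)
```

Rejecting paths that contain a double quote keeps the generated JSL string well formed without needing to handle JSL's quoting rules.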
With the new JMP and Python integration tools, the only bottleneck in realizing my initial goal of just seeing the data was my lack of Python experience (I'd only dabbled in it just prior to joining JMP several years ago); using the Python tools for JMP was the easy part. In just one short afternoon, I was able to bang through all my Python coding errors to access the data and share the JMP visualization (also see video and attachments below). I might have even reached out directly to the developer (don't tell my boss) to express my gratitude.
I've started working on other data access tools as @Bill_Worley suggested here, and I expect that you will soon see some of those efforts in the JMP Marketplace. Trying to navigate your own data and analytic workflow spanning Python and JMP? If you haven't seen it, @yasmine_hajar has put together excellent resources to help you get acquainted with some of the new tools outlined above. If you have questions, you can explore the JMP User Community, post your question there, reach out to your dedicated JMP team, or drop a line below.
Level up Python with JMP - data access.zip