Kwangki
Level III

How to convert Python Pandas dataFrame to jmp format without saving it as files?

I know JMP supports the "Python Interface" functions within the JMP scripting environment. But sometimes it is better for the Python engineer and the JMP engineer to work independently.

 

In our case, the Python engineer reads database files through a DB interface, pre-processes them with the pandas library, and wants to convert the result to JMP format, because the JMP engineer needs the data in JMP format to do data analysis and post-processing with JSL.

 

As you can imagine, the bottleneck is how to convert a pandas DataFrame without saving it to a file. The easiest way is to save files and read them back into JMP format, but saving and reading them takes a long time. I'd like to know whether there is any library that converts a pandas DataFrame to JMP format directly in memory.

1 ACCEPTED SOLUTION

Accepted Solutions
txnelson
Super User

Re: How to convert Python Pandas dataFrame to jmp format without saving it as files?

One method for doing this would be to convert the Python pandas DataFrame to a JSL version of a JMP table. This is a fully supported and documented form of a JMP data table, and it allows you to include any of the Column Properties, row states, etc. To see examples of what you would need to do, point to any existing JMP data table and ask JMP to generate the script that will recreate the table.

Jim


11 REPLIES
Byron_JMP
Staff

Re: How to convert Python Pandas dataFrame to jmp format without saving it as files?

If the data is small enough to fit on the clipboard, you could just copy the data frame to the clipboard and paste it from the clipboard into JMP.

 

This is an example from a Discovery Summit Talk:

https://community.jmp.com/t5/Discovery-Summit-2017/Leveraging-Python-and-REST-to-Introduce-SAS-Viya-...

 

Names Default To Here( 1 );
path = Get Current Directory();

// Copy Big Class, including column names, to the clipboard
dt1 = Open( "$SAMPLE_DATA/Big Class.jmp" );
Current Data Table( dt1 );
dt1 << Bring Window To Front;
Main Menu( "Copy With Column Names" );

// Python script that reads the clipboard into a DataFrame and plots it
text = "\[
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_clipboard(sep='\\s+')
size = (df.height * df.weight) / df.age ** .5  # marker size
ax = plt.subplot(111, projection='polar')
c = plt.scatter(df.height, df.weight, c=df.age, s=size, cmap=plt.cm.hsv)
c.set_alpha(0.55)  # transparency
plt.show()
]\";

// Write the script to the current directory and run it with an external Python
Save Text File( path || "graph.py", text );

x = Run Program(
	Executable( "/Users/user1/anaconda/bin/python" ),  // adjust to your Python install
	Options( path || "graph.py" ),
	Read Function( "blob" )
);
Close( dt1, No Save );  // this executes as soon as the graph window is closed
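
The script above moves data from JMP to Python. For the opposite direction mentioned at the top of this post (DataFrame to clipboard to JMP), a minimal Python sketch could look like the following. The function names are hypothetical, and it assumes the data pastes cleanly as tab-separated text with a header row.

```python
import pandas as pd


def frame_as_tsv(df):
    """Tab-separated text with a header row and no index column.

    Useful for previewing what will land on the clipboard.
    """
    return df.to_csv(sep="\t", index=False)


def copy_frame_for_jmp(df):
    """Put the same tab-separated text on the system clipboard.

    Needs a desktop session with a working clipboard, so it is only
    defined here and not called automatically.
    """
    df.to_clipboard(excel=True, sep="\t", index=False)
```

After calling `copy_frame_for_jmp(df)`, the data should be pasteable into a new JMP data table with the paste counterpart of "Copy With Column Names" (the exact menu name may differ between JMP versions). Missing values are written as empty cells.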

 

JMP Systems Engineer, Health and Life Sciences (Pharma)
Kwangki
Level III

Re: How to convert Python Pandas dataFrame to jmp format without saving it as files?

Thank you for your advice.

 

But if possible, I'd like to know how to transfer the DataFrame directly into a JMP-format table that JSL (JMP Scripting Language) recognizes within the JSL environment. In other words, JSL should recognize the pandas DataFrame just like a JMP data table. The reason is that, in my case, I can do almost all of my JSL automation work if I can start from data in JMP format.

Byron_JMP
Staff

Re: How to convert Python Pandas dataFrame to jmp format without saving it as files?

In JMP 14 there are some new functions for getting and sending things with Python.

I haven't been able to use them yet.

There is a function called Python Get(name) that might work.

JMP Systems Engineer, Health and Life Sciences (Pharma)
taless
Level I

Re: How to convert Python Pandas dataFrame to jmp format without saving it as files?

Can I ask for details on converting a Python pandas DataFrame to JMP format by saving it as a file?

I wish to mine data as pandas DataFrames and save them in a format that JMP can read easily for further investigation.

Thanks in advance

GoodwinJMP
Level I

Re: How to convert Python Pandas dataFrame to jmp format without saving it as files?

Hello! I know this is a 4-year-old post now, and I am a new user in the JMP community... but for our use cases, the solution presented here to the original post is not comprehensive enough.

 

We would like a server running Python only (not JMP / JSL) to do all the data pulling and preparation / transformation, then serialize the data (without a local file write) and send it back to a client in a form directly readable in JMP.

 

Our data is larger, say 100k to 1M rows and 10 to 100 columns. It is heterogeneous (numeric and categorical) and may contain missing values.

 

Do any of the DataFrame.to_*() methods support this? Those pandas methods can write to an IO buffer rather than a file to save time, and the result could then be transferred over the network to the JMP client?

 

The leading candidate formats that we would want to work are .to_parquet(), .to_hdf(), and .to_feather(). It looks like pandas also has a to_stata() method that writes the "Stata" format. I am not sure whether the JSL (SAS) format is open enough that pandas could implement it someday (it may be low on the list there). Parquet seems to be universal these days, and pandas also has the Arrow "Table" type as a first-class type, so that is intriguing to us as well.

 

We would want to avoid formats that require re-parsing the data and re-assigning data types (numeric, categorical, date_time, ...), so .to_csv(), .to_json(), and other text formats are not really useful.

 

Thanks in advance for reading through the post,  and let me know if I should have started a new thread,  rather than replying to this one.

 

Randall

 

Byron_JMP
Staff

Re: How to convert Python Pandas dataFrame to jmp format without saving it as files?

I guess you could use JMP's "Open Database" command or Query Builder to pull data directly from the database. You would need an ODBC driver for the database. After connecting, you could submit a SQL query to retrieve your data in JMP. Query Builder would let you write the SQL graphically.

 

CSV is a pretty common data transport format, so I wouldn't abandon it so quickly. Also, JMP reads text and CSV files with multiple threads, so they open fairly quickly.

JMP Systems Engineer, Health and Life Sciences (Pharma)
Craige_Hales
Super User

Re: How to convert Python Pandas dataFrame to jmp format without saving it as files?

to_json with orient='table' might be a good choice. It does need to convert from binary to text and back again. JMP's JSON wizard should handle it fairly directly. Using CSV with the multiple file import might actually be fast enough: when JMP imports the files, it uses all of your CPU cores to load the columns in parallel. It might be too slow on the server end, depending on how it's implemented.
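
A minimal sketch of the to_json suggestion, with a hypothetical helper name. The "table" orientation writes a "schema" section with a type per column next to the "data" rows, so the reader does not have to guess the column types.

```python
import json

import pandas as pd


def frame_to_table_json(df):
    """Serialize a DataFrame as JSON with an embedded column schema."""
    # index=False leaves the pandas row index out of both schema and data
    return df.to_json(orient="table", index=False)


# Small demonstration frame; NaN is written as JSON null
demo = pd.DataFrame({"age": [12, 13], "height": [59.0, float("nan")]})
document = json.loads(frame_to_table_json(demo))
field_types = {f["name"]: f["type"] for f in document["schema"]["fields"]}
```

Here `field_types` maps "age" to "integer" and "height" to "number", and the missing height appears as null in `document["data"]`.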
I did roll my own binary-to-binary transfer in the Beowulf, Newton, and Mr Hanson blog post some time back. It used numpy to output a block of binary data from the Python code and the Blob To Matrix function in JSL to load the binary data back in. That worked well because the data was all numeric.
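
The Python half of that kind of transfer might look roughly like this. It is not the code from the blog post, just a sketch: the function name is made up, and the choice of little-endian 8-byte floats in row-major order is an assumption that the JSL side would have to match when it decodes the blob.

```python
import numpy as np


def matrix_to_blob_bytes(values):
    """Pack an all-numeric 2-D table into raw bytes plus its shape.

    `values` can be a nested list, a NumPy array, or the result of
    DataFrame.to_numpy(dtype="float64") for an all-numeric frame.
    """
    # "<f8" = little-endian 64-bit float; NaN carries missing values
    matrix = np.ascontiguousarray(values, dtype="<f8")
    if matrix.ndim != 2:
        raise ValueError("expected a 2-D table of numbers")
    n_rows, n_cols = matrix.shape
    # Row-major bytes: all of row 0, then all of row 1, and so on
    return matrix.tobytes(order="C"), n_rows, n_cols
```

The receiver needs the column count (and the byte layout) to rebuild the matrix, so those have to travel alongside the bytes, for example in a small header or a separate field.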

Craige
Byron_JMP
Staff

Re: How to convert Python Pandas dataFrame to jmp format without saving it as files?

@Craige_Hales  you-da man Craig : )

JMP Systems Engineer, Health and Life Sciences (Pharma)