keith_a_kraft
Level II

Best method to get a python object into JMP

I have some Python code that generates about 3 GB of data over 15 minutes. It currently streams the data to a text file, one "column" per line in the form value1|value2|value3.., and then post-processes the whole text file into a CSV that JMP can open. After the values are split, the CSV can have up to ~30k columns and 100 to 10k rows.

To get the data into JMP faster, I would like to stream the text data directly from the memory of the Python session into a JMP session.

I have tried some OLE automation using pycom, but the calls to pycom seem very slow, which makes it hard to iterate over each line and update the correct row for the couple of values produced in that "line".

Is there another way? An SQLite in-memory table? A JSL socket? The JMP Python API?

6 REPLIES
Craige_Hales
Super User

Re: Best method to get a python object into JMP

I think you could move IEEE floating point data or integers between Python and JMP using Python's struct module (section 7.1 of the Python 3.5.0 documentation, "Interpret bytes as packed binary data") on the Python end and blobToMatrix on the JMP end. Skipping the conversion from binary to character and back to binary will speed it up a lot. Use a file or a socket between them. Are they on the same machine? That will make it a little easier. Not the same machine? Then the data might not be IEEE floating point... Character data would need different handling, but should be easier than numeric.
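A minimal sketch of the Python half of that idea, assuming the values are doubles and that big-endian byte order is chosen explicitly so both ends agree regardless of machine. The helper names `pack_doubles` and `unpack_doubles` are hypothetical; the unpack side only stands in for the decode that blobToMatrix would do in JMP.

```python
import struct

def pack_doubles(values):
    """Pack a sequence of floats as big-endian IEEE 754 doubles (8 bytes each)."""
    return struct.pack(">%dd" % len(values), *values)

def unpack_doubles(blob):
    """Inverse of pack_doubles; stands in for the JMP-side decode."""
    count = len(blob) // 8
    return list(struct.unpack(">%dd" % count, blob))

values = [0.0, 1.5, -2.25, 1e300]
blob = pack_doubles(values)
print(len(blob))                       # 4 values * 8 bytes each
print(unpack_doubles(blob) == values)  # round trip is exact for doubles
```

Because the bytes are never converted to text, the round trip is exact and no parsing is needed on either side.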

Craige
Craige_Hales
Super User

Re: Best method to get a python object into JMP

Here's a complete example that runs in about 2 seconds for 1,000,000 doubles:

The Python code builds an array of doubles, packs it into a binary string of bytes, and writes the bytes to stdout. You could use a file if you like; just make sure it is opened in binary mode (wb). The ">" at the beginning of the pack format means "big endian".

"generate.py"

import array
import struct
import math
import sys

# Windows Python translates newlines on stdout unless it is switched to binary mode
if sys.platform == "win32":
    import os, msvcrt
    msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)

bigmat = array.array("d")
for x in range(0, 1000000):
    bigmat.append(math.sqrt(x))

# %s is replaced by "1000000" ... d is for double precision, 8 bytes each
binary = struct.pack(">%sd" % len(bigmat), *bigmat)
# len(binary) is 8000000 bytes

# Python 3 needs the underlying byte stream (sys.stdout.buffer);
# Python 2 accepts the byte string on sys.stdout directly
getattr(sys.stdout, "buffer", sys.stdout).write(binary)

The JSL uses runProgram to run the Python program and reads a blob back from its stdout. BlobToMatrix specifies "big" to match the big endian data. You could use LoadTextFile( ...BLOB ) instead of runProgram if you make the Python program run separately and write a file. You'd still need blobToMatrix.

"fetch.jsl"

x = runprogram(executable("C:\Python27\python.exe"),
        options("C:\Users\User\Desktop\pythonExample\generate.py"),
        readfunction("blob"));
xx = blobToMatrix(x,"float",8,"big");

// verify...
ok="good";
for(i=0,i<nrows(xx),i++,
  if( xx[i+1] != sqrt(i), ok="bad")
);
show(ok);

// log output:
// ok = "good";

The fix for the binary newline issue came from http://stackoverflow.com/questions/2374427/python-2-x-write-binary-output-to-stdout
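The file-based variant mentioned in this reply could look roughly like the sketch below. It is an illustration under assumptions, not part of the original post: the function name `write_doubles_big_endian` and the output file name are hypothetical, and the JMP side would still need to load the file as a blob and call blobToMatrix with "big".

```python
import array
import math
import os
import sys
import tempfile

def write_doubles_big_endian(path, values):
    """Write values to path as raw big-endian doubles, 8 bytes each."""
    data = array.array("d", values)
    if sys.byteorder == "little":
        data.byteswap()  # array stores native byte order; swap to big-endian
    with open(path, "wb") as f:  # binary mode avoids newline translation on Windows
        data.tofile(f)
    return len(data) * data.itemsize

# hypothetical output location, chosen only for the example
out_path = os.path.join(tempfile.gettempdir(), "doubles_example.bin")
n_bytes = write_doubles_big_endian(out_path, (math.sqrt(x) for x in range(1000)))
print(n_bytes)
```

Using `array.tofile` avoids building the large argument list that `struct.pack(..., *bigmat)` needs, which may matter for data in the gigabyte range.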

Craige
vince_faller
Super User (Alumni)

Re: Best method to get a python object into JMP

Craige@JMP

I was having an issue with Run Program that seems to have to do with the options and file name I pass to it. If I use pick file(), it crashes for me. It also crashes if I hard-code a directory with a space in it.

 

Do you have any suggestions to make this dynamic?  Using JMP 12.0.1.

 

Example

 

 

f = pick file();
 
 
//doesn't work
x = runprogram(executable("C:\Python34\python.exe"),
options(f),
readfunction("blob")
);
 
//doesn't work
x = runprogram(executable("C:\Python34\python.exe"),
options("/C:/Users/User/Desktop/generate.py"),
readfunction("blob")
);
 
//doesn't work
x = runprogram(executable("C:\Python34\python.exe"),
options("'C:/Google Drive/Work/Scripting/generate.py'"),
readfunction("blob")
);
 
//works
x = runprogram(executable("C:\Python34\python.exe"),
options("C:/Users/User/Desktop/generate.py"),
readfunction("blob")
);

 

Vince Faller - Predictum
ms
Super User (Alumni)

Re: Best method to get a python object into JMP

Very cool way to use Run Program

 

I don't have the solution, but this works in JMP 12.1 on Mac:

 

 

f = Pick File();
x = RunProgram(
    executable("/usr/bin/python"),
    options(f),
    readfunction("blob")
);

 

 
Craige_Hales
Super User

Re: Best method to get a python object into JMP

Thanks for looking at it, and good question.

I think you are fighting the Windows command line behavior with embedded blanks in file names. In a DOS box, you'd use quotation marks like this:

[Screenshot: quotes.png]

The escaping gets a little ugly, but here it is in JSL:

runprogram(executable("cmd.exe"),
  options({"/C","dir \!"c:\Users\chales\Saved Games\!""}),
  readfunction("text"))

I don't have python here at the office, but I anticipate it will have the same behavior.  It gets worse if you need to embed quotation marks; another project I'm working on used \" as described in one of the answers in http://stackoverflow.com/questions/7760545/cmd-escape-double-quotes-in-parameter

If you get to choose the path names for your projects, leaving out the blanks will make it easier.  If you have a file picker, you need to allow for them.
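To see what the Windows quoting convention does to a path with blanks, here is a small sketch using Python's `subprocess.list2cmdline`, which joins arguments using the Microsoft C runtime rules. The wrapper name `quote_for_windows` is hypothetical, and whether Run Program passes the resulting string through unchanged is an assumption that would need testing in JMP.

```python
import subprocess

def quote_for_windows(args):
    """Join arguments into one command-line string using the MS C runtime quoting rules."""
    return subprocess.list2cmdline(args)

# path with a blank, taken from the failing example in the thread
script_with_blank = "C:/Google Drive/Work/Scripting/generate.py"
# path without blanks, taken from the working example
script_plain = "C:/Users/User/Desktop/generate.py"

print(quote_for_windows([script_with_blank]))  # wrapped in double quotes
print(quote_for_windows([script_plain]))       # left unquoted
```

Only double quotes group an argument under these rules, which is consistent with the single-quoted path in the earlier example not working.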

Craige
vt_sailor
Level II

Re: Best method to get a python object into JMP

Craige,

Not much help, but it also works this way in JMP 13. You would have to ask JMP support... Maybe it will get fixed in JMP 14.