hogi
Level XIII

Compress (identical) images in JMP Data Tables

JMP can store images in a data table column.

Even storing the same image 1000x is super fast - and no issue for the memory of the PC.
But storing the file on the hard drive takes a while - and a lot of space!

Activating the GZ compression doesn't help:

compress.png

 

Try it out - why does the GZ compression not work?

dt = New Table( "test", Add Rows( 1000 ), Compress File When Saved( 1 ), New Column( "img", Expression ) );
myImg = New Image( Open( "https://apod.nasa.gov/apod/image/2603/NGC1300-LRGB_1024.jpg" ) );

// add the image to row 1 and save the table
:img[1] = myImg;
dt << Save( "$TEMP/test1.jmp" );
Wait( 0 );
Show( File Size( "$TEMP/test1.jmp" ) );


// add the image to every row
for each row(
	:img=myImg
);
wait(0);

// copy the image from the last row and shrink the copy for display
lastImg = New Image( :img[1000] );
lastImg << Scale( 0.2 );

ex = New Window( "Modal Dialog example",
	Modal,
	V List Box(
		Text Box( "Wow, very fast! This is the image in row 1000:" ),
		Picture Box( lastImg ),
		Text Box( "Do you have enough time/space to save the data table?" ),
		H List Box(
			Button Box( "OK", bb << Close Window() ),
			bb = Button Box( "Cancel", bb << Close Window(); Stop(); )
		)
	)
);

// save the table with the image in all 1000 rows
dt << Save( "$TEMP/test2.jmp" );
Wait( 0 );
Show( File Size( "$TEMP/test2.jmp" ) );
9 REPLIES
hogi
Level XIII

Re: Compress (identical) images in JMP Data Tables

Before the file is saved, why is JMP so fast?
For the user it seems that in every row an image is saved:

image.png

But JMP just stores a reference to the original image. Changing the size reveals the link:

names default to here(1);
dt = new table("test", add rows(10),Compress File When Saved( 1 ), new column ("img", Expression));
myIMg= New Image(open("https://apod.nasa.gov/apod/image/2603/NGC1300-LRGB_1024.jpg"));
:img[1] = myImg;

// add the image to every row - as a "reference"
for each row(
	:img=myImg
);
wait(0);

// add the image to row 4 - as a new image
:img[4] = new image(myImg);

// change one of the images
image2 = :img[2];
image2 << scale(0.5);

// ... changes all the images - because all of them are links to the same image
Show(image2 << get size);
Show(myImg << get size);
Show(:img[3] << get size);

// all of them? not row 4
Show(:img[4] << get size);

result.png

hogi
Level XIII

Re: Compress (identical) images in JMP Data Tables

The same trick applies when tables with images are stacked.
Be brave - you can stack thousands of columns and get thousands of duplicated images - they are clones and don't consume memory.
But don't dare to save the stacked table!

Names Default to Here(1);
dt = Open( "$SAMPLE_DATA/Big Class Families.jmp" );
dtstack = dt << Stack(
	columns( :height, :weight )
);

:picture[1] << scale(5);
Show(:picture[1] << get size);
Show(:picture[2] << get size);
hogi
Level XIII

Re: Compress (identical) images in JMP Data Tables

... and consider the collateral damage.
They are less obvious than you might think.


Names Default to Here(1);
dt = Open( "$SAMPLE_DATA/Big Class Families.jmp" );
Show(dt:picture[1] << get size);

dt sub = dt<< Subset( All rows, Selected columns only( 0 ) ); // non-linked subset
dt stack = dt << Stack(
	columns( :height, :weight )
);


dt stack:picture[1] << scale(0.1);
Show(dt stack:picture[1] << get size);
Show(dt stack:picture[2] << get size);

Show(dt:picture[1] << get size);
Show(dt sub:picture[1] << get size);

result.png

Re: Compress (identical) images in JMP Data Tables

A JPG is already compressed. Compressing random, compressed, or encrypted data often makes it grow rather than shrink: the compressor finds little to compress, yet still adds headers and housekeeping data around the result. Yes, the data table is 'saving' the binary image as a base64 blob in a string field in the column, but it still looks like mostly random data to the gzip compression.
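To make the point above concrete, here is a small sketch in Python (not JSL, since it only illustrates general gzip behaviour and says nothing about JMP's internals). It assumes that random bytes from `os.urandom` are a fair stand-in for JPEG data, which is an approximation: a real JPEG has some structure, but its payload is similarly hard to compress.

```python
import gzip
import os

# Stand-in for JPEG data (assumption): random bytes are incompressible,
# much like the entropy-coded payload of a JPEG.
payload = os.urandom(200_000)

# Highly repetitive data for contrast.
text = b"height,weight,age\n" * 10_000

packed_payload = gzip.compress(payload)
packed_text = gzip.compress(text)

# The random payload comes out slightly larger (gzip header, trailer and
# block bookkeeping); the repetitive text shrinks dramatically.
print("random:    ", len(payload), "->", len(packed_payload))
print("repetitive:", len(text), "->", len(packed_text))
```

The growth on the random payload is small, but it is growth: the compressor cannot find redundancy inside the image itself.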

I would say the real question, probably one to pass to JMP Tech Support (and hence to the data table developer), is why a single copy plus references isn't saved here instead of identical image copies. I also assume that if you load the saved table, there will be a memory impact, since the images are now probably distinct. If the data table knows they are identical in memory, why isn't it maintaining that on save?

hogi
Level XIII

Re: Compress (identical) images in JMP Data Tables

[00296024]


-> "It is not surprising that JMP does not compress a data table in a way that maximizes efficiency for storing images. It seems that everything is working correctly."

hogi
Level XIII

Re: Compress (identical) images in JMP Data Tables

Indeed, the easiest option would be: JMP recognises that the images are identical and uses this information when storing the file.
It seems that, in memory, JMP uses something like a 'pointer', with multiple pointers pointing to the same image -> cool when the table is generated.
When JMP stores the file, it follows the pointers and is happy to write out the identical images one by one.


However, it was still surprising to me that the compression algorithm for the column does not identify the duplicates and store them efficiently.

My AI suggests that (maybe for reasons of backward compatibility?) JMP uses an old compression mechanism with a small deflate window (JMP mentions gz in the preferences). In that case, the first image will be outside the deflate window when the second image is parsed.

AI suggestion: 
Switching to a more modern compression scheme (zstd / lzma?) could lead not only to a significant improvement in the compression of multiple identical images, but also to better compression of other column types.
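
The window argument can be checked outside JMP. The following Python sketch rests on two assumptions: that random bytes stand in for a JPEG, and that JMP's "gz" option behaves like standard DEFLATE (the thread does not confirm this). DEFLATE can only refer back 32 KiB, so a second copy of an image larger than that is invisible to it. LZMA with default settings uses a much larger dictionary and does find the repeats. zstd is not in the Python standard library, so only lzma is shown.

```python
import lzma
import os
import zlib

# Stand-in for one JPEG (assumption): 300 kB of incompressible bytes,
# i.e. far larger than DEFLATE's 32 KiB back-reference window.
image = os.urandom(300_000)

# The same "image" stored in 20 rows, one copy after the other.
column = image * 20

deflated = zlib.compress(column, 9)   # DEFLATE, best compression level
xz = lzma.compress(column)            # LZMA, default preset

print("raw:    ", len(column))
print("deflate:", len(deflated))      # about the raw size: duplicates not found
print("lzma:   ", len(xz))            # roughly the size of a single image
```

The large dictionary only helps while the distance between two copies fits inside it, so for very large images even LZMA would need a correspondingly large dictionary setting.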

lwx228
Level VIII

Re: Compress (identical) images in JMP Data Tables

Excel: Relational container (deduplicates images via pointers).
JMP: Flat data stream (duplicates image binary per row).

— Stated by Gemini 3.1 Pro

Re: Compress (identical) images in JMP Data Tables

The developer is aware of the issue and indicated to me that he is looking into it. Images were not originally supported in JMP data tables; they are a more recent addition (last 10 years or so). I believe that when image support was added, it was done this way so that the .jmp file format did not need to change and stayed backward compatible: to older versions of JMP, the images would show up as string data. It also allows building a table with images from scripting. Fixing this is likely to require an update to the .jmp file format, possibly storing the image internally like categorical or list check data, and preferably as a binary blob rather than a base64 conversion.
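
As a rough illustration of what such a change could save, here is a Python sketch comparing the two layouts. Everything in it is hypothetical: the function names, the use of a SHA-256 key per row, and the random bytes standing in for an image are assumptions for the sake of the arithmetic, not a description of the .jmp format or of any planned fix.

```python
import base64
import hashlib
import os

def per_row_size(images):
    """Bytes needed when every row carries its own base64 copy."""
    return sum(len(base64.b64encode(img)) for img in images)

def dedup_size(images):
    """Bytes needed when each distinct image is stored once as raw bytes
    and every row holds a 32-byte SHA-256 key (hypothetical layout)."""
    store = {}
    for img in images:
        store.setdefault(hashlib.sha256(img).digest(), img)
    return sum(len(blob) for blob in store.values()) + 32 * len(images)

image = os.urandom(150_000)   # stand-in for one image (assumption)
rows = [image] * 1000         # the same image in 1000 rows

print("base64 per row:", per_row_size(rows))   # 1000 copies, each 4/3 of raw size
print("deduplicated:  ", dedup_size(rows))     # one raw copy plus 1000 keys
```

Base64 alone inflates each copy by a third; the far larger factor comes from writing the same image once per row.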

It's great you are rooting out these issues for us to fix. Ultimately we do want performance and great memory usage in JMP.  

Data is only getting bigger.

hogi
Level XIII

Re: Compress (identical) images in JMP Data Tables

Sounds great.
Thank you for the open discussion and the insights :)

Our shared goal is to make JMP as good as it can possibly be—robust, well-rounded, and refined.
It should continue to be the best analysis tool in the world. ... And with the rapid advent of AI, the competition will get tougher!
