clausa
Level III

JMP Table & Column Compression

Does anyone have any experience with JMP Table and/or Column compression?

Are there any concerns or pitfalls with using either? I have some JMP tables that are larger than 1 GB, so it would be nice to compress them.

As far as I can tell, there are two types of compression built into JMP:

  1. Compress Tables
    1. How to Access:
      1. Open file
      2. Click red arrow next to table name on the left, above where the scripts are located, in the table panel
      3. Click "Compress File When Saved"
      4. Can also be set as the default via "Preferences > General > Save Data Table Columns GZ Compressed"
    2. Purpose: This compresses the whole file, à la zip (though I think it is actually GZ), so it takes up less space on disk. It has the same effect as zipping the file, except the result remains a .jmp file rather than a .zip with a .jmp inside it. I have seen up to 10x compression here.
  2. Compress Selected Columns
    1. How to Access:
      1. Select columns in file
      2. Click red arrow for columns (or Cols from the top)
      3. Select "Compress Selected Columns"
      4. Can also be auto-enabled
    2. Purpose: Uses List Check and compressed integers where available to make columns/cells actually take less memory. This reduces the file size and speeds up analysis.
Accepted Solutions
Jeff_Perkinson
Community Manager

Re: JMP Table & Column Compression

You've outlined the two options very well.

The main difference is that Compress Tables affects only how big the file is on disk. JMP will use the un-compressed version in memory. So, this option is useful to keep your drive from filling up with JMP data tables.
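
As a rough illustration of the disk-versus-memory distinction, here is a minimal Python sketch. It is not JMP code and makes no claim about how JMP implements the option; the `raw` bytes are a made-up stand-in for a table's contents. It only shows the general behavior of gzip-style compression: the stored form is smaller, but the data has to be expanded back to full size before it can be used.

```python
import gzip

# Hypothetical stand-in for a table's raw bytes; repetitive data compresses well.
raw = b"lot_A,1,0.5\n" * 100_000

compressed = gzip.compress(raw)         # the smaller form that would sit on disk
restored = gzip.decompress(compressed)  # the full-size form needed to work with the data

# Original size, compressed size, and size after decompression.
print(len(raw), len(compressed), len(restored))
```

The compressed size depends heavily on how repetitive the data is, so the ratio from this toy example says nothing about what a real table would achieve.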

Compress Selected Columns results in smaller files on disk as well as using less memory. Unfortunately, not every column can benefit here.

Here's what Compress Selected Columns does:

  • It adds a List Check (default order) to a character column if the column has fewer than 255 distinct values.
  • It changes numeric columns to the smallest of 1-byte, 2-byte, or 4-byte integer storage that can hold every value in the column. Only columns that contain integer values are checked.
    • For a 1-byte integer, the range of numbers that you can store is -126 to 127.
    • For a 2-byte integer, the range of numbers that you can store is -32,766 to 32,767.
    • For a 4-byte integer, the range of numbers that you can store is -2,147,483,646 to 2,147,483,647.
  • It will not change columns that already have a List Check.
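
The rules above can be sketched in Python to estimate which columns might benefit. This is only an illustration built from the ranges and limits quoted in this post, not JMP's actual implementation; the function names `smallest_integer_width` and `qualifies_for_list_check` are hypothetical, and missing values are not handled.

```python
from typing import Optional, Sequence

# Storable ranges as quoted in the post above: byte width -> (lowest, highest).
INTEGER_RANGES = {
    1: (-126, 127),
    2: (-32_766, 32_767),
    4: (-2_147_483_646, 2_147_483_647),
}


def smallest_integer_width(values: Sequence[float]) -> Optional[int]:
    """Return 1, 2, or 4 if every value is a whole number that fits that width, else None."""
    # Only integer-valued columns are candidates.
    if not all(float(v).is_integer() for v in values):
        return None
    # Try the widths from smallest to largest and keep the first that fits.
    for width, (low, high) in sorted(INTEGER_RANGES.items()):
        if all(low <= v <= high for v in values):
            return width
    return None


def qualifies_for_list_check(values: Sequence[str], has_list_check: bool = False) -> bool:
    """Return True if a character column has fewer than 255 distinct values and no List Check yet."""
    return not has_list_check and len(set(values)) < 255


print(smallest_integer_width([1, 2, 127]))       # fits the 1-byte range
print(smallest_integer_width([1, 2, 128]))       # needs the 2-byte range
print(smallest_integer_width([0.5, 1.0]))        # not all integers
print(qualifies_for_list_check(["a", "b", "a"]))
```

Running something like this over an exported sample of a table gives a rough idea of how many columns fall into each case before compressing the real thing.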

HTH,

-Jeff

bswedlove
Level IV

Re: JMP Table & Column Compression

I run "dt << Compress Selected Columns();" on tables with hundreds of columns, and the command fills up my log with all the changes. Can I run the command but stop it from writing to the log?

Jeff_Perkinson
Community Manager

Re: JMP Table & Column Compression

Unfortunately, I don't see any way to keep it from writing to the log. I'll enter an enhancement request to see if we can add this in a future release.

In the meantime, I can only come up with some unsatisfying hacks that involve saving the log before the call to compress the columns and clearing it afterward.

-Jeff
jay_holavarri
Level III

Re: JMP Table & Column Compression

I've been working with files that are larger than I typically deal with and started using the 'Compress File When Saved' option. It really cuts down on the size, but I am always suspicious of free-lunch solutions. Is there really no downside? Why wouldn't this be the default for all tables?

If the only downside is something like a 500 ms lag when opening a table, then I'll want to do it all the time.

Thanks!