
Fast text file format for importing to JMP

I produce large delimited data files (typically CSV or TSV), usually 1-3 GB in size, which will ultimately be imported into JMP by users. I can specify the format of the files and would like some opinions on which non-JMP format is best, and whether there are any hacks for speeding up the import of large text-based data files.

 

Cheers, Troy

 

 

 

5 REPLIES
hogi
Level XII

Re: Fast text file format for importing to JMP

Depending on the settings in the preferences, text import may or may not be possible with a point or a comma as the decimal separator.
"Not possible" means: the data can be imported, but the numbers will be interpreted as characters, with a time and memory penalty:

JMP can handle numbers much more efficiently than text. Importing them as strings and then converting them to numbers could be too severe a bottleneck to get your 1-3 GB into the computer's memory.

 

With JMP 17(?), compact columns were introduced. They allow character columns (with repeating entries) to be stored much more efficiently in memory. But there could be a bottleneck in getting the data from a CSV file into the compact column.

 

If you want to load numeric data in a "grid", please have a look at the HDF5 data format.
JMP can import such files, but the functionality is restricted. It is better to load the data via Python.

hogi
Level XII

Re: Fast text file format for importing to JMP

If there is time/date data in the input file, there are some restrictions: 
CSV import force MDY or DMY date format 

 

 

jthi
Super User

Re: Fast text file format for importing to JMP

Since you already have the files available, have you tried anything yet?

 

  • Which options have you tried for opening the files?
    • JMP "ways" of opening a file: Open / Multiple File Import
    • Python integration in JMP 18 (DuckDB / pandas / polars and so on)
  • Did you run into any performance issues?
  • Which file formats have you tried, and which can you create?
    • Usually .csv is a pretty good file type, as it is easy to create and generally easy to open
    • There are other file types, such as Parquet, Pickle or Feather, but they can be more difficult to create and have their pros and cons (you will have to use JMP's Python integration to load these into JMP).
    • Could a database be an option? SQLite would most likely be the simplest option, and JMP can open it easily
  • How many files should a user have open at the same time?
  • How many users are there?
  • Will all users open the same files? Could you open them with some automated process and create a JMP table which users would then use, or store the results in a database instead of text files?
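
To make the database option a bit more concrete, here is a minimal sketch using only the Python standard library. It assumes a well-formed, UTF-8, comma-delimited file with a header row and the same number of fields on every line; the function names (`to_value`, `csv_to_sqlite`) and the table name `data` are made up for illustration. Keep in mind that a default SQLite build limits a table to 2000 columns, so very wide files would have to be split or reshaped first.

```python
import csv
import sqlite3


def to_value(text):
    """Return None for an empty field, a float for numeric text, else the string."""
    if text == "":
        return None
    try:
        return float(text)
    except ValueError:
        return text


def csv_to_sqlite(csv_path, db_path, table="data", batch_size=10_000):
    """Load a delimited text file into a new SQLite table, batch by batch.

    Returns the number of data rows written.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        # Quote identifiers so column names with spaces or symbols survive.
        columns = ", ".join('"' + name.replace('"', '""') + '"' for name in header)
        table_q = '"' + table.replace('"', '""') + '"'
        placeholders = ", ".join("?" for _ in header)
        insert_sql = f"INSERT INTO {table_q} VALUES ({placeholders})"

        conn = sqlite3.connect(db_path)
        try:
            # Columns are declared without a type, so values are stored as given.
            conn.execute(f"CREATE TABLE {table_q} ({columns})")
            total = 0
            batch = []
            for row in reader:
                batch.append([to_value(field) for field in row])
                if len(batch) >= batch_size:
                    conn.executemany(insert_sql, batch)
                    total += len(batch)
                    batch = []
            if batch:  # write the last, partially filled batch
                conn.executemany(insert_sql, batch)
                total += len(batch)
            conn.commit()
        finally:
            conn.close()
    return total
```

Numbers are converted before they are stored, so whatever reads the database later does not have to parse text again.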
-Jarmo

Re: Fast text file format for importing to JMP

Thanks everyone for the replies. 

 

The files are simply pre-prepared datasets, typically with 10K+ columns. The column names and types differ from file to file, so it is difficult to identify the column types ahead of time and specify them in preferences. They are available to hundreds of people, but usually someone works on just one file at a time. JMP can handle the data size (3 GB is rare; most files are around 1.2-1.5 GB); it just takes time to open them. Also, they are on a shared drive, but people usually copy them locally before opening.
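
Since the column types are hard to know ahead of time, one option is to guess them from a sample of rows when the dataset is created. The following is only a rough sketch, using the Python standard library: the function names and the 1000-row sample size are arbitrary choices for illustration, and how the result would be passed on to JMP's import settings is not covered here.

```python
import csv
from itertools import islice


def is_number(text):
    """True if the text parses as a float (this also accepts 'nan' and 'inf')."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def infer_column_types(csv_path, sample_rows=1000, delimiter=","):
    """Guess 'numeric' or 'character' for each column from the first rows.

    Returns a list of (column name, type) pairs in file order. Columns with
    no values in the sample default to 'character'.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader)
        numeric = [True] * len(header)  # stays True until a non-number shows up
        seen = [False] * len(header)    # whether any non-empty value was seen
        for row in islice(reader, sample_rows):
            for i, value in enumerate(row[:len(header)]):
                if value == "":
                    continue  # empty fields are treated as missing
                seen[i] = True
                if numeric[i] and not is_number(value):
                    numeric[i] = False
    return [
        (name, "numeric" if numeric[i] and seen[i] else "character")
        for i, name in enumerate(header)
    ]
```

A guess based on a sample can be wrong if a non-numeric value first appears further down the file, so a larger sample (or a full pass) is safer when the files are produced in batch anyway.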

 

I think the way forward is to convert them to JMP format when I make the datasets and save them as compressed JMP files in addition to CSV. I saw roughly 4x compression when I tested this yesterday. Some users consume these files in Jupyter Notebooks or directly in other analysis platforms, so I'll keep the CSV versions available for those folks as well. I'll just need some extra space on the NetApp filer.

 

Thanks all again for the input.  

jthi
Super User

Re: Fast text file format for importing to JMP

JMP's compression can definitely help JMP users, especially if the files are stored on a network drive. This is something we did for our JMP users at some point: we converted the .csv files found in a specific network drive location into JMP table(s). Nowadays we try (it is definitely not always possible or easy) to store such files in a database, though we usually do not have that many columns, just lots of rows.

 

There is also the option of creating a separate add-in that JMP users would use to open those files. It still won't overcome the issue with file size, but the tool could first copy the files to a local drive and then load them from there.
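
The copy-first step could look roughly like the sketch below. It uses only the Python standard library; the function name `local_copy` and the cache folder name `dataset_cache` are hypothetical, and the "same size and not older" check is a simple heuristic rather than a real integrity check.

```python
import os
import shutil
import tempfile


def local_copy(shared_path, cache_dir=None):
    """Copy a file from a share to a local folder unless a current copy exists.

    Returns the path of the local copy.
    """
    if cache_dir is None:
        cache_dir = os.path.join(tempfile.gettempdir(), "dataset_cache")
    os.makedirs(cache_dir, exist_ok=True)
    target = os.path.join(cache_dir, os.path.basename(shared_path))

    source = os.stat(shared_path)
    if os.path.exists(target):
        cached = os.stat(target)
        # Same size and a local copy at least as new: assume it is current.
        if cached.st_size == source.st_size and cached.st_mtime >= source.st_mtime:
            return target

    # copy2 also preserves the modification time of the source file.
    shutil.copy2(shared_path, target)
    return target
```

The returned path would then be handed to whatever step actually opens the file, so repeated opens of the same dataset do not go over the network again.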

-Jarmo