cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Choose Language Hide Translation Bar
Rans
Level II

import csv with two rows as header

Hi, 

I'm working with relative large csv files (±150MB), The files start with two rows that need to be combined as the column header. 

I wrote a script that loop through each column and set the column name to the current name + the value of the first row. 

 

The problem it is running very slow, sometimes I need to wait ±10min for changing the name of 4000 columns. 

It is somehow depends on the file size, changing the same amount of column in a small file takes few seconds. 

 

I tried few workarounds - 

1. subset only the first row and changing all column names (very fast). I then thought I'll concatenate the original table, but I can't do it because the column names now are not same. 

2. updating the CSV import setting as suggested here. but I still don't see an option to include two rows as header.

3. export the file as xls and then import it with two rows as header. the export time is longer than the original column name change. 

 

Is there another workaround that I'm missing, maybe there is an option to change all column names based on a list?

 

Thanks in advance for the helpers.

1 ACCEPTED SOLUTION

Accepted Solutions
Craige_Hales
Super User

Re: import csv with two rows as header

You might try multiple file import, even for just one file. Might be faster, and supports multiple line headers.

File->Import Multiple FilesFile->Import Multiple Files

 

Craige

View solution in original post

2 REPLIES 2
SDF1
Super User

Re: import csv with two rows as header

Hi @Rans ,

 

  If you open the file with JMP and deselct the box "Use default program to open. Uncheck to open as text." and then select the radio button "Data (Using Preview)"; be sure to select all files (*.*) in the JMP open window.

DiedrichSchmidt_0-1612445487639.png

You can then select some of the important delimiters and other things, including which row the data starts, so if there are two rows as headers, you would skip one and then set the data starts to line 3 or whatever it is.

DiedrichSchmidt_1-1612445585367.png

You can then open up the "Source" script in the data table and edit the Import settings as needed and use this script in your JSL program.

DiedrichSchmidt_2-1612445652532.png

Alternatively, if you can re-save it as an XLSX file, then you can use the Open() command and define which tab to open, which column to start/stop with, and which row to start/stop with as well as having multiple rows as header. That should solve your problem of having to re-name the columns.

DiedrichSchmidt_3-1612445814991.png

 

 The last option I'd mention is that if you know what the column names and they don't change, then in your script, you can have a list ColNames ={"...", "...",...} with all the correct names in quotations separated by commas in the list. Then, after loading the CSV file, you can run a For Loop to rename each column according to the entries in ColNames. For example:

dt=Current data table();//or whatever your data table name is.
icols=.;//variable to count through the columns
For(icols=1, icols<=N Columns(dt), icols++,
Column(dt, icols) <<Set Name(ColNames[icols]) //this sets the column name to the "icols-th" entry of the list ColNames
);

Hope one of these helps!

 

Good luck,

DS

Craige_Hales
Super User

Re: import csv with two rows as header

You might try multiple file import, even for just one file. Might be faster, and supports multiple line headers.

File->Import Multiple FilesFile->Import Multiple Files

 

Craige