cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
Helal
Level II

Import Multiple CSV Files from different folder, Update Value, and save CSV with different names

I have about 95 csv files with different column names saved under main data folder and 95 different sub-folders like \data\dataset1\dataset1.csv, \data\dataset2\dataset2.csv, etc. id1 and id2 columns are common variables across all files. I want to to do the following:

  • Import all 95 csv files to JMP (using JMP Pro 16)
  • Update a value based on a condition across all 95 files: if (id1 = 'a001' then id2='b001')
  • Save updated csv files under different names in respective folders

First row is the file title and can't be removed. Column/variable names start at row 2 and data on row 3. 

I managed to import all files by:

 

dt: Multiple File Import(
<<Set Folder(
"C:\Data\"
),
<<Set Show Hidden( 0 ),
<<Set Subfolders( 1 ),
<<Set Name Filter( "*.csv" ),
<<Set Name Enable( 1 ),
<<Set Size Filter( {522, 2096152} ),
<<Set Size Enable( 0 ),
<<Set Date Filter( {3715322235, 3718948571} ),
<<Set Date Enable( 0 ),
<<Set Add File Name Column( 0 ),
<<Set Add File Size Column( 0 ),
<<Set Add File Date Column( 0 ),
<<Set Import Mode( "CSVData" ),
<<Set Charset( "Best Guess" ),
<<Set Stack Mode( "Table Per File" ),
<<Set CSV Has Headers( 1 ),
<<Set CSV Allow Numeric( 1 ),
<<Set CSV First Header Line( 2 ),
<<Set CSV Number Of Header Lines( 1 ),
<<Set CSV First Data Line( 3 ),
<<Set CSV EOF Comma( 1 ),
<<Set CSV EOF Tab( 0 ),
<<Set CSV EOF Space( 0 ),
<<Set CSV EOF Spaces( 0 ),
<<Set CSV EOF Other( "" ),
<<Set CSV EOL CRLF( 1 ),
<<Set CSV EOL CR( 1 ),
<<Set CSV EOL LF( 1 ),
<<Set CSV EOL Semicolon( 0 ),
<<Set CSV EOL Other( "" ),
<<Set CSV Quote( "\!"" ),
<<Set CSV Escape( "" ),
<<Set XML Method( "guess" ),
<<Set XML Guess( "huge" ),
<<Set XML Settings( XML Settings() ),
<<Set JSON Method( "guess" ),
<<Set JSON Guess( "huge" ),
<<Set JSON Settings( JSON Settings() ),
<<Set Import Callback( Empty() )
) << Import Data;

 

In this code, there is no formula for where column names and data start.

Then, I tried to update a value by using:

dt: subjid <<
:id1 << Set Formula(
If( :id2 == "a001", :id1 == "b001")
);

result, values prior to "a001" in id1 column are deleted and id1 value of "a001" is not updated. 

I am stock in this phase and can't even think of next stage of saving files under different names in their respective folders. Any help/guidance is greatly appreciated.

 

Thank you

1 ACCEPTED SOLUTION

Accepted Solutions
Craige_Hales
Super User

Re: Import Multiple CSV Files from different folder, Update Value, and save CSV with different names

Thanks, made a post for it because it seems generally useful and some of it isn't really obvious how to do.

MFI Update CSV 

Craige

View solution in original post

4 REPLIES 4
Craige_Hales
Super User

Re: Import Multiple CSV Files from different folder, Update Value, and save CSV with different names

I'm not sure what your JSL did:

dt: subjid << :id1 << Set Formula( If( :id2 == "a001", :id1 == "b001") )

The << "send" operator should only appear once, and the column you are adding the formula to should be the left-hand-side. Then in the if statement, there should be a condition, followed by a value for the true case, and another value for the false case. Something like this

 

dt << New Column( "likes",
	"character",
	formula( If( 
		Is Missing( name ), // first test
			"unknown", // if first test is true
		name == "KATIE", // second test
			"cats", // if second test is true
			"dogs"  // if second test is false
		) 
	)
);

which is intended to create a new formula column of cats or dogs depending on the value in column name. If name is blank (missing), "unknown".

 

Or, if your goal is to overwrite an existing column(s) based on values of other columns, perhaps an explicit loop would be better than a formula column:

foreachrow(dt,
	if(
		name == "DANNY", // a test with ==
			name = "Dani"; // an assignment with =
			sex = "F";
		, /* else if */ name == "ROBERT",
			name = "Roberta";
			sex = "F";
	);
);

( If Secrets  might help understanding the commas and semicolons. ) That leaves rows unchanged, unless  the name matches.

 

Notice how the two examples are different:

The first example must produce an answer in the if(...) that will be assigned to the formula's column. If the column should be unchanged for that row, the formula's value must be the original value for the row.  Edit: sorry, that's wrong. Just make sure the formula column produces a value for every row.

The second example only updates rows and columns it chooses.

 

Since you plan to write out a CSV file (not save a JMP file), the formula expression you add to a column will not be saved (though the updated values from the formula will be.) I think you might want something more like the second example.

Craige
Helal
Level II

Re: Import Multiple CSV Files from different folder, Update Value, and save CSV with different names

Thank you Craige. As you mentioned, the second example worked well. However, it only worked if I open or import one csv but didn't work when I imported multiple CSVs. Also, will you please guide me to also accomplish the last stage of writing csv back to their respective folders with different filenames?

 

Thank you,

Helal

Craige_Hales
Super User

Re: Import Multiple CSV Files from different folder, Update Value, and save CSV with different names

Thanks, made a post for it because it seems generally useful and some of it isn't really obvious how to do.

MFI Update CSV 

Craige
Helal
Level II

Re: Import Multiple CSV Files from different folder, Update Value, and save CSV with different names

Thank you again. This is awesome.

 

Helal