cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
aliegner1
Level IV

how to cycle a script through multiple csv files

following up on a previous post, I've got a script that stacks and formats an unwieldy csv file.

https://community.jmp.com/t5/Discussions/Help-doing-a-table-transform-and-converting-a-timestamp/td-...

 

My next question for help is how can I get this to then repeat on a group of csv's in a folder, capturing the filename as one of the columns, ideally stacking them as well. aka step1, step2, step3, step4 files.

Stacking a single csv creates a 400k row output. Will I risk crashing if I merge 4 ot 8 or 10 of these csv's?

Maybe extra would be to then go cycle again through the next folder in the parent folder, creating a new combined step file?

 

names default to here(1);

//Prompt to select and open an MS Excel file
dt = Open( Pick File( "Select Excel File", "C:\Users\folder\", {"Excel Files|csv;xlsx;xls", "All Files|*"}, 1, 0 ), Invisible );

// Get the column names from the data table
colNames = dt << get column names;
// Remove the 1st col from the list
remove from(colNames,1,1);

// Stack the data
dtStack = dt << Stack(
	columns(
		colNames
	),
	Source Label Column( "Label" ),
	Stacked Data Column( "Data" ),
	Output Table( "Stacked_Data" )
);

//label the 1st column to Lamda
:Column 1 << Set Name("Lamda");
:Lamda << Modeling Type(nominal);

// Create a new JMP DateTime column
dtStack << New Column( "Timestamp",
	Numeric,
	"Continuous",
	Format( "d/m/y h:m:s", 26, 3 ),
	Input Format( "d/m/y h:m:s", 0 ),
	Formula( Informat( Word( 1, :Label, "Z" ), "d/m/y h:m:s" ) )
);
// Remove formula to convert column to static values
dtStack:Timestamp << delete formula;

// Create the Time variable column
dtstack << New Column( "Time",
	Numeric,
	"Continuous",
	Format( "hr:m:s", 17, 3 ),
	Input Format( "hr:m:s", 0 ),
	Formula( Time Of Day( :Timestamp ) )
);

//Create Absolute Time column, uses if>then error checking for Lamda group changing
dtstack << New Column( "AbsTime",
	Numeric,
	"Continuous",
	Format( "Fixed Dec", 12, 1 ),
	Formula(
		If( Row() == 1 | :Lamda != Lag( :Lamda ),
			0,
			Lag( :AbsTime ) + (:Timestamp - Lag( :Timestamp ))
		)
	),
);
// Remove formula to convert column to static values
dtStack:AbsTime << delete formula;

// Get rid of no longer needed "Label" column
dtStack << delete columns( "Label" );

 

5 REPLIES 5
txnelson
Super User

Re: how to cycle a script through multiple csv files

Basically, all you have to do, is to

  1. Use "Files In Directory() to get the list of files to be processed
  2. Set up a data table to be used for the final data table( dtFinal )
  3. Use a For() loop to loop across all of the files found in the directory creating the deStack table
    1. Add the Source column to dtStack
    2. Concatenate dtStack to the dtFinal table
    3. Close the dtStack and dt data tables

Below is your script modified with the above items.  The code is not tested, and you have to add in the directory etc. to read the files from.  It should give you a real good starting point for getting your final product

names default to here(1);

//Prompt to select and open an MS Excel file
//dt = Open( Pick File( "Select Excel File", "C:\Users\folder\", {"Excel Files|csv;xlsx;xls", "All Files|*"}, 1, 0 ), Invisible );

// Get a list of files from the directory
fileList = Files In Directory( <path to your directory? );

// Create a skeleton data table to end up being the combined data table
dtFinal = New Table("Final");

// Loop across all of the files found in the directory and read them in
For(i = 1, i <= N Items( fileList ), i++,

dt = open( <path to your directory\> || fileList[i] );

// Get the column names from the data table
colNames = dt << get column names;
// Remove the 1st col from the list
remove from(colNames,1,1);

// Stack the data
dtStack = dt << Stack(
	columns(
		colNames
	),
	Source Label Column( "Label" ),
	Stacked Data Column( "Data" ),
	Output Table( "Stacked_Data" )
);

//label the 1st column to Lamda
:Column 1 << Set Name("Lamda");
:Lamda << Modeling Type(nominal);

// Create a new JMP DateTime column
dtStack << New Column( "Timestamp",
	Numeric,
	"Continuous",
	Format( "d/m/y h:m:s", 26, 3 ),
	Input Format( "d/m/y h:m:s", 0 ),
	Formula( Informat( Word( 1, :Label, "Z" ), "d/m/y h:m:s" ) )
);
// Remove formula to convert column to static values
dtStack:Timestamp << delete formula;

// Create the Time variable column
dtstack << New Column( "Time",
	Numeric,
	"Continuous",
	Format( "hr:m:s", 17, 3 ),
	Input Format( "hr:m:s", 0 ),
	Formula( Time Of Day( :Timestamp ) )
);

//Create Absolute Time column, uses if>then error checking for Lamda group changing
dtstack << New Column( "AbsTime",
	Numeric,
	"Continuous",
	Format( "Fixed Dec", 12, 1 ),
	Formula(
		If( Row() == 1 | :Lamda != Lag( :Lamda ),
			0,
			Lag( :AbsTime ) + (:Timestamp - Lag( :Timestamp ))
		)
	),
);
// Remove formula to convert column to static values
dtStack:AbsTime << delete formula;

// Get rid of no longer needed "Label" column
dtStack << delete columns( "Label" );

// Add filename as a column to the data table
dtStack << new column( "Source", character, set each value( filename[i] ) );

// Concatenate to final table
dtFinal << concatenate( dtStack, append to first table(1) );

// Clean up
close( dtStack, nosave );
close( dt, nosave );

// End For() Loop
);
Jim
Craige_Hales
Super User

Re: how to cycle a script through multiple csv files

Also check out Multiple File Import. It loads CSV files, concatenates them if they have the same columns, and can optionally add the file name column.

Craige
aliegner1
Level IV

Re: how to cycle a script through multiple csv files

I looked at this but it doesn't seem to fit my needs. It will just open up individual jmp dt's. The column #'s are dynamic, based on the time range for each process file.

aliegner1
Level IV

Re: how to cycle a script through multiple csv files

Awesome. It seems to be working, but then errors out after the first file w/ an

"Name Unresolved: filenam}1} in access or evaluation of 'filename' , filename/*###*/"

 

any thoughts?

I also adjusted the script to do a select folder popup, i assume that isn't the problem.

// Get a list of files from the directory
dir = Pick Directory( "Select a directory", "C:\Users\folder\");
fileList = Files In Directory( dir );

// Create a skeleton data table to end up being the combined data table
dtFinal = New Table("Final");

// Loop across all of the files found in the directory and read them in
For(i = 1, i <= N Items( fileList ), i++,

dt = open( dir || fileList[i] );
Craige_Hales
Super User

Re: how to cycle a script through multiple csv files

Looks like the code wants to use filelist rather than filename for the list of file names. The one near the end.

Craige