cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Choose Language Hide Translation Bar
Neal85
Level III

Importing multiple csv files with the same format

I have several csv files with similar format. All csv files have header on row no. 279 & data starts from row no 281. Even though the length of data may vary, the last 4 rows on all csv files have irrelevant information and needs to be taken rid of.

Currently, I am using 'Multiple file import' functionality to import the data, but I don't know how to exclude the last 4 rows. I get rid of the rows after I import in JMP table. Is it possible to create a script to automatically import all the csv files on a directory and concatenate them by excluding the last 4 rows. The script below is the one I get when I import a single csv file. 

Open("C:\Users\TEST\ABC.csv",
	columns(
		New Column( "Record", Character, "Nominal" ),
		New Column( "14", Numeric, "Continuous", Format( "Best", 12 ) ),
		New Column( "ID", Character, "Nominal" ),
		New Column( "XREL", Character, "Nominal" ),
		New Column( "YREL", Character, "Nominal" ),
		),
	Import Settings(
		End Of Line( CRLF, CR, LF ),
		End Of Field( Tab, Space, Comma, CSV( 0 ) ),
		Strip Quotes( 1 ),
		Use Apostrophe as Quotation Mark( 0 ),
		Use Regional Settings( 0 ),
		Scan Whole File( 1 ),
		Treat empty columns as numeric( 0 ),
		CompressNumericColumns( 0 ),
		CompressCharacterColumns( 0 ),
		CompressAllowListCheck( 0 ),
		Labels( 1 ),
		Column Names Start( 279 ),
		Data Starts( 281 ),
		Lines To Read( "All" ),
		Year Rule( "20xx" )
)
)

Actually all the columns  contain  numeric values in continuous format. But, the irrelevant information abovementioned makes it to be detected as 'nominal'. 

1 ACCEPTED SOLUTION

Accepted Solutions
txnelson
Super User

Re: Importing multiple csv files with the same format

Here is a modification to your code that finds all of the files in a given directory, opens each one, deletes the last four rows, and then adds it to a concatenation file.

 

The code is not fully tested, since I do not have files that match your input.  but it should be close, and you should be able to work through any minor issues it has.

Names Default To Here( 1 );

dirFiles = Files In Directory( "C:\Users\TEST" );

// Create the output data table to concatenate all files to
dtFinal = New Table( "Final", New Column( "Source", character ) );

For( i = 1, i <= N Items( dirFiles ), i++,
	dt = Open(
		"C:\Users\TEST\" || dirFiles[i],
		columns(
			New Column( "Record", Character, "Nominal" ),
			New Column( "14", Numeric, "Continuous", Format( "Best", 12 ) ),
			New Column( "ID", Character, "Nominal" ),
			New Column( "XREL", Character, "Nominal" ),
			New Column( "YREL", Character, "Nominal" ),

		),
		Import Settings(
			End Of Line( CRLF, CR, LF ),
			End Of Field( Tab, Space, Comma, CSV( 0 ) ),
			Strip Quotes( 1 ),
			Use Apostrophe as Quotation Mark( 0 ),
			Use Regional Settings( 0 ),
			Scan Whole File( 1 ),
			Treat empty columns as numeric( 0 ),
			CompressNumericColumns( 0 ),
			CompressCharacterColumns( 0 ),
			CompressAllowListCheck( 0 ),
			Labels( 1 ),
			Column Names Start( 279 ),
			Data Starts( 281 ),
			Lines To Read( "All" ),
			Year Rule( "20xx" )
		)
	);

// Delete the last 4 rows
	dt << delete rows( Index( N Rows( dt ) - 4, N Rows( dt ) ) );

// Add a source column
	dt << New Column( "Source", character, set each value( dirFiles[i] ) );

// Concatenate the file
	dtFinal << concatenate( dt, Append to first table( 1 ) );

// delete the current input table
	Close( dt, nosave );
);

This will run slower that the Multiple File Import, but it will get the job done

Jim

View solution in original post

2 REPLIES 2
txnelson
Super User

Re: Importing multiple csv files with the same format

Here is a modification to your code that finds all of the files in a given directory, opens each one, deletes the last four rows, and then adds it to a concatenation file.

 

The code is not fully tested, since I do not have files that match your input.  but it should be close, and you should be able to work through any minor issues it has.

Names Default To Here( 1 );

dirFiles = Files In Directory( "C:\Users\TEST" );

// Create the output data table to concatenate all files to
dtFinal = New Table( "Final", New Column( "Source", character ) );

For( i = 1, i <= N Items( dirFiles ), i++,
	dt = Open(
		"C:\Users\TEST\" || dirFiles[i],
		columns(
			New Column( "Record", Character, "Nominal" ),
			New Column( "14", Numeric, "Continuous", Format( "Best", 12 ) ),
			New Column( "ID", Character, "Nominal" ),
			New Column( "XREL", Character, "Nominal" ),
			New Column( "YREL", Character, "Nominal" ),

		),
		Import Settings(
			End Of Line( CRLF, CR, LF ),
			End Of Field( Tab, Space, Comma, CSV( 0 ) ),
			Strip Quotes( 1 ),
			Use Apostrophe as Quotation Mark( 0 ),
			Use Regional Settings( 0 ),
			Scan Whole File( 1 ),
			Treat empty columns as numeric( 0 ),
			CompressNumericColumns( 0 ),
			CompressCharacterColumns( 0 ),
			CompressAllowListCheck( 0 ),
			Labels( 1 ),
			Column Names Start( 279 ),
			Data Starts( 281 ),
			Lines To Read( "All" ),
			Year Rule( "20xx" )
		)
	);

// Delete the last 4 rows
	dt << delete rows( Index( N Rows( dt ) - 4, N Rows( dt ) ) );

// Add a source column
	dt << New Column( "Source", character, set each value( dirFiles[i] ) );

// Concatenate the file
	dtFinal << concatenate( dt, Append to first table( 1 ) );

// delete the current input table
	Close( dt, nosave );
);

This will run slower that the Multiple File Import, but it will get the job done

Jim
Neal85
Level III

Re: Importing multiple csv files with the same format

Thanks Jim.

Your solution saved me alot of time.