cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
mlipsitz
Level I

Transpose header data and concatenate files

I have a set of raw data csv files with metadata in the header that I want to concatenate and copy the header data into columns. I have been able to modify the "combine txt tables.jsl" script I found on this site to pull in the header data, transpose it and join it to the raw data table. The header data is contained within each raw data file on rows 2 through 29, and I transpose it to make the first column into column names and the second column into the metadata, duplicated down each raw data row. For some reason it is only pulling in the first file and doesn't continue the import loop. My script is as follows:

names default to here (1);
Clear Symbols();
//set up variables with starting values
filelist={};   //empty list

//launch dialog to get the file location
prefilepath = munger(Pick Directory( "Browse to the Directory of the .txt / .csv files " ),1,"/","");
filepath = Convert File Path( prefilepath, Windows );
prefilelist = Files In Directory( filepath );
n2=nitems(prefilelist);

//filter out any non-txt or csv files
For( i2 = 1, n2 >= i2, i2++,
	file=(prefilelist[i2]);
	// changed from item 2 to item -1
	If( Item( -1, prefilelist[i2], "." ) == "txt" | Item( -1, prefilelist[i2], "." ) == "csv",
		 Insert Into( filelist,file),
		 show(file)
	)
);

//filter out any non-Excel files for the file list. 
For( i2 = 1, n2 >= i2, i2++,
	file=(prefilelist[i2]);
	If( Item( -1, prefilelist[i2], "." ) == "xls" | Item( -1, prefilelist[i2], "." ) == "xlsx",
		 Insert Into( filelist,file),
		 show(file)
	)
);

//Create raw data table:
nf=nitems(filelist); //number of items in the working list

For( iii = 1 , iii <= nf, iii++, //this starts the first loop
filenow = ( filelist[iii] );
fileopen=(filepath||filenow);
dt=open(fileopen,private,
	Import Settings(
		End Of Line( CRLF, CR, LF ),
		End Of Field( Tab, Comma, CSV( 0 ) ),
		Strip Quotes( 1 ),
		Use Apostrophe as Quotation Mark( 0 ),
		Use Regional Settings( 0 ),
		Scan Whole File( 1 ),
		Treat empty columns as numeric( 0 ),
		CompressNumericColumns( 0 ),
		CompressCharacterColumns( 0 ),
		CompressAllowListCheck( 0 ),
		Labels( 1 ),
		Column Names Start( 1 ),
		Data Starts( 2 ),
		Lines To Read( 29 ),
		Year Rule( "20xx" )
	)
);

//Transpose header data into new columns and create "Transposed Header Table"
dt<<( "Header table" ) << Transpose(
	columns( :TTV0.000000 ),
	Label( :Name( "Test Name:" ) ),
	Output Table( "Transposed Header table" )
);

//New Column( "Source", Character, Nominal );
//:Source << set each value( filenow );

dt=open(fileopen,private,
	Import Settings(
		End Of Line( CRLF, CR, LF ),
		End Of Field( Tab, Comma, CSV( 0 ) ),
		Strip Quotes( 1 ),
		Use Apostrophe as Quotation Mark( 0 ),
		Use Regional Settings( 0 ),
		Scan Whole File( 1 ),
		Treat empty columns as numeric( 0 ),
		CompressNumericColumns( 0 ),
		CompressCharacterColumns( 0 ),
		CompressAllowListCheck( 0 ),
		Labels( 1 ),
		Column Names Start( 32 ),
		Data Starts( 33 ),
		Lines To Read( "All" ),
		Year Rule( "20xx" )
	)
);


New Column( "Source", Character, Nominal );
:Source << set each value( filenow );

//Merge header table with raw data into "Combined Data Table"
dtjoined=dt<< Join(
	With( Data Table( "Transposed Header table" ) ),
	Cartesian Join
); 

//add the current table to the bottom of the combined data table
dtjoined << Concatenate( Data Table( dt ), Append to first table );
dtjoined << Set Name("Combined Data Table");
Close(Data Table("Transposed Header table"),NoSave);
Close(Data Table("Header table"),NoSave);

dt << Run Formulas();

);//end of the first for loop

I think the loop stops because of table naming. If I comment out the "Close(Data Table("Header table"),NoSave);" script it imports the first file correctly then stops. If I take out the comment I get a data table with the first set of raw data and header data imported as expected plus the second data file's worth of raw data missing the header data. It never makes it beyond that second file. Here is an example showing how the raw data file is formatted, with metada on rows 1 through 3 and raw data starting on row 5:

Test cycle1  
LaboratoryLab_A  
Serial Number123ABC  
TimeMetric AMetric BMetric C
117060
2.17161.1
3.27362.5
4.37863.8
5.48065.7
6.58569.2
7.68972.3
2 ACCEPTED SOLUTIONS

Accepted Solutions
gzmorgan0
Super User (Alumni)

Re: Transpose header data and concatenate files

@mlipsitz, there are multiple problems with this script. I think you need to do more work learning basic JSL syntax.  Yes, we all learn a language by taking already written code and modifying it, but you need some basics.  For example this portion of your script looks like you were trying to rename table dt. There is no place in your code where there is a column ::TTV0.000000 is defined, and this transpose would not do what I think you want done.

 

//Transpose header data into new columns and create "Transposed Header Table"
dt<<( "Header table" ) << Transpose(
	columns( :TTV0.000000 ),
	Label( :Name( "Test Name:" ) ),
	Output Table( "Transposed Header table" )
);

Also, when the code below is run, dt is the table of raw data values before the cartesian join and dtjoined is the table of values with the labels. You are overwriting tables (name collisions).

//add the current table to the bottom of the combined data table
dtjoined << Concatenate( Data Table( dt ), Append to first table );

Here are a couple suggestions:

  • Do not make tables private until your script is written and running properly.
  • dt is just shorthand for a table reference. It is okay to reuse (re-purpose) a variable name, but I do not recommend it for a table until the earlier table is closed. Use variable names that make the code more readable, for example dthdr for the header table and dtx for the transposed table.
  • Close the table as soon as they are no longer needed.
  • Do you really want to join files or concatenate files? If each file has the same metrics, then I think you want concatenate.
  • Test your script with a couple files. In other words, learn how to get a base hit, before swinging for the fences. 

Your read statements did not make sense for your sample table.  Are you sure each of the files have the same format?  Or do you need to search for a pattern to determine where the raw data starts?  This script will work of the file formats are fixed. The number of raw data rows can change.  Note, I hardcoded the list of files to test. The code to get the files looks okay.

 

Names default to Here(1);
filelist = {"example1_20190214.txt", "example2_20190214.txt"};
filepath = "c:\temp\";
df = nitems(filelist);

for(iii=1, iii<=df, iii++,
filenow  = filelist[iii];

fileopen=(filepath||filenow);
dthdr=open(fileopen,//private,
	Import Settings(
		End Of Line( CRLF, CR, LF ),
		End Of Field( Tab, Comma, CSV( 0 ) ),
		Strip Quotes( 1 ),
		Use Apostrophe as Quotation Mark( 0 ),
		Use Regional Settings( 0 ),
		Scan Whole File( 1 ),
		Treat empty columns as numeric( 0 ),
		CompressNumericColumns( 0 ),
		CompressCharacterColumns( 0 ),
		CompressAllowListCheck( 0 ),
		Labels(0 ),
		Column Names Start(0 ),
		Data Starts( 1 ),
		Lines To Read( 3 ),
		Year Rule( "20xx" )
	)
);
nc = ncol(dthdr);
dthdr << Combine Columns( Delimiter(" "), Columns(2::nc), Column Name("Info") );
dthdr << delete columns(3::nc);

//Transpose header data into new columns and create "Transposed Header Table"
dthdr<< set name( "Header table" );
dtx = dthdr << Transpose(
	columns( :Info ),
	Label( Column(1) ),
	Output Table( "Transposed Header table" )
);
dtx << delete column (:Label);

//Close Header table
Close(dthdr,Nosave); 

//New Column( "Source", Character, Nominal );
//:Source << set each value( filenow );

dt=open(fileopen,//private,
	Import Settings(
		End Of Line( CRLF, CR, LF ),
		End Of Field( Tab, Comma, CSV( 0 ) ),
		Strip Quotes( 1 ),
		Use Apostrophe as Quotation Mark( 0 ),
		Use Regional Settings( 0 ),
		Scan Whole File( 1 ),
		Treat empty columns as numeric( 0 ),
		CompressNumericColumns( 0 ),
		CompressCharacterColumns( 0 ),
		CompressAllowListCheck( 0 ),
		Labels( 1 ),
		Column Names Start( 4 ),
		Data Starts( 5 ),
		Lines To Read( "All" ),
		Year Rule( "20xx" )
	)
);


New Column( "Source", Character, Nominal );
:Source << set each value( filenow );

//Merge Transposed Header table, dtx,  with raw data table, dt
dtjoined=dt<< Join(
	With( dtx ),
	Cartesian Join
); 

//close Transposed Header table
Close(dtx, NoSave);
//close raw data table, dt
Close(dt, NoSave);

//For the first file, keep it, it will be used to append other file
//For the other files, append then close 
If (iii==1 , dtfinal = dtjoined,
	//else
	current data table(dtfinal);
	dtfinal << Concatenate( dtjoined , Append to first table(1));
	wait(0);
	close(dtjoined, NoSave)
);


//no need for Run Formulas() there are no formulas in the table	

); //end the for iii loop 

//Give the final table a name
dtfinal << set name( "Combined Data Table"); 

I tested this script using the file you displayed and a modification of that file. Here is the result using the script above.

image.png

View solution in original post

mlipsitz
Level I

Re: Transpose header data and concatenate files

Thank you very much ian_jmp and gzmorgan0. I agree I was struggling with some of the syntax and trying to combine code from various sources. I was able to get gzmorgan0's code to work with some slight adjustments. See the attached script. Yes, all of my files are formatted identically from an automated data acquisition system. Thanks again for your help with this and for the syntax lesson!

View solution in original post

6 REPLIES 6
ian_jmp
Level X

Re: Transpose header data and concatenate files

Using the format at the bottom of your message, here is some code that will process .CSV files on the desktop (of a Mac)

NamesDefaultToHere(1);

// Get a list of .csv files on the desktop
fileList = FilesInDirectory("$DESKTOP");
for(f=NItems(fileList), f>=1, f--, If(!EndsWith(Uppercase(fileList[f]), ".CSV"), RemoveFrom(fileList, f)));
fileList;

// Process each .csv file
For(f=1, f<=NItems(fileList), f++,
	// Get metadata (generated by JMP)
	dtm = Open(
			"$DESKTOP/"||fileList[f],
			columns(
				New Column( "Name", Character, "Nominal" ),
				New Column( "Value", Character, "Nominal" ),
			),
			Import Settings(
				End Of Line( CRLF, CR, LF ),
				End Of Field( Comma, CSV( 0 ) ),
				Strip Quotes( 1 ),
				Use Apostrophe as Quotation Mark( 0 ),
				Use Regional Settings( 0 ),
				Scan Whole File( 1 ),
				CompressNumericColumns( 0 ),
				CompressCharacterColumns( 0 ),
				CompressAllowListCheck( 0 ),
				Labels( 0 ),
				Column Names Start( 1 ),
				Data Starts( 1 ),
				Lines To Read( 3 ),
				Year Rule( "20xx" )
				),
			Invisible
			);
	// Get data (generated by JMP)
	dtd = Open(
			"$DESKTOP/"||fileList[f],
			columns(
				New Column( "Time", Numeric, "Continuous", Format( "Best", 12 ) ),
				New Column( "Meric A", Numeric, "Continuous", Format( "Best", 12 ) ),
				New Column( "Metric B", Numeric, "Continuous", Format( "Best", 12 ) ),
				New Column( "Metric C", Numeric, "Continuous", Format( "Best", 12 ) )
			),
			Import Settings(
				End Of Line( CRLF, CR, LF ),
				End Of Field( Comma, CSV( 0 ) ),
				Strip Quotes( 1 ),
				Use Apostrophe as Quotation Mark( 0 ),
				Use Regional Settings( 0 ),
				Scan Whole File( 1 ),
				Treat empty columns as numeric( 0 ),
				CompressNumericColumns( 0 ),
				CompressCharacterColumns( 0 ),
				CompressAllowListCheck( 0 ),
				Labels( 1 ),
				Column Names Start( 4 ),
				Data Starts( 5 ),
				Lines To Read( "All" ),
				Year Rule( "20xx" )
				),
			Invisible
			);
	// Add the values from dtm to dtd
	for (c=1, c<=NRow(dtm), c++,
		vals = {}; for(r=1, r<=NRow(dtd), r++, InsertInto(vals, Column(dtm, 2)[c]));
		dtd << newColumn(Column(dtm, 1)[c], Character, Values(vals));
		);
	Close(dtm, NoSave);
	// Append tables
	if(f==1, 
		dtFinal = Eval(dtd << GetScript),
		dtFinal << Concatenate(dtd, AppendToFirstTable);
		);
	Close(dtd, NoSave);
	);
	
// Tidy up dtFinal
dtFinal << ShowWindow(1);
dtFinal << setName("Final Data");
dtFinal << deleteProperty("Source");
gzmorgan0
Super User (Alumni)

Re: Transpose header data and concatenate files

@mlipsitz, there are multiple problems with this script. I think you need to do more work learning basic JSL syntax.  Yes, we all learn a language by taking already written code and modifying it, but you need some basics.  For example this portion of your script looks like you were trying to rename table dt. There is no place in your code where there is a column ::TTV0.000000 is defined, and this transpose would not do what I think you want done.

 

//Transpose header data into new columns and create "Transposed Header Table"
dt<<( "Header table" ) << Transpose(
	columns( :TTV0.000000 ),
	Label( :Name( "Test Name:" ) ),
	Output Table( "Transposed Header table" )
);

Also, when the code below is run, dt is the table of raw data values before the cartesian join and dtjoined is the table of values with the labels. You are overwriting tables (name collisions).

//add the current table to the bottom of the combined data table
dtjoined << Concatenate( Data Table( dt ), Append to first table );

Here are a couple suggestions:

  • Do not make tables private until your script is written and running properly.
  • dt is just shorthand for a table reference. It is okay to reuse (re-purpose) a variable name, but I do not recommend it for a table until the earlier table is closed. Use variable names that make the code more readable, for example dthdr for the header table and dtx for the transposed table.
  • Close the table as soon as they are no longer needed.
  • Do you really want to join files or concatenate files? If each file has the same metrics, then I think you want concatenate.
  • Test your script with a couple files. In other words, learn how to get a base hit, before swinging for the fences. 

Your read statements did not make sense for your sample table.  Are you sure each of the files have the same format?  Or do you need to search for a pattern to determine where the raw data starts?  This script will work of the file formats are fixed. The number of raw data rows can change.  Note, I hardcoded the list of files to test. The code to get the files looks okay.

 

Names default to Here(1);
filelist = {"example1_20190214.txt", "example2_20190214.txt"};
filepath = "c:\temp\";
df = nitems(filelist);

for(iii=1, iii<=df, iii++,
filenow  = filelist[iii];

fileopen=(filepath||filenow);
dthdr=open(fileopen,//private,
	Import Settings(
		End Of Line( CRLF, CR, LF ),
		End Of Field( Tab, Comma, CSV( 0 ) ),
		Strip Quotes( 1 ),
		Use Apostrophe as Quotation Mark( 0 ),
		Use Regional Settings( 0 ),
		Scan Whole File( 1 ),
		Treat empty columns as numeric( 0 ),
		CompressNumericColumns( 0 ),
		CompressCharacterColumns( 0 ),
		CompressAllowListCheck( 0 ),
		Labels(0 ),
		Column Names Start(0 ),
		Data Starts( 1 ),
		Lines To Read( 3 ),
		Year Rule( "20xx" )
	)
);
nc = ncol(dthdr);
dthdr << Combine Columns( Delimiter(" "), Columns(2::nc), Column Name("Info") );
dthdr << delete columns(3::nc);

//Transpose header data into new columns and create "Transposed Header Table"
dthdr<< set name( "Header table" );
dtx = dthdr << Transpose(
	columns( :Info ),
	Label( Column(1) ),
	Output Table( "Transposed Header table" )
);
dtx << delete column (:Label);

//Close Header table
Close(dthdr,Nosave); 

//New Column( "Source", Character, Nominal );
//:Source << set each value( filenow );

dt=open(fileopen,//private,
	Import Settings(
		End Of Line( CRLF, CR, LF ),
		End Of Field( Tab, Comma, CSV( 0 ) ),
		Strip Quotes( 1 ),
		Use Apostrophe as Quotation Mark( 0 ),
		Use Regional Settings( 0 ),
		Scan Whole File( 1 ),
		Treat empty columns as numeric( 0 ),
		CompressNumericColumns( 0 ),
		CompressCharacterColumns( 0 ),
		CompressAllowListCheck( 0 ),
		Labels( 1 ),
		Column Names Start( 4 ),
		Data Starts( 5 ),
		Lines To Read( "All" ),
		Year Rule( "20xx" )
	)
);


New Column( "Source", Character, Nominal );
:Source << set each value( filenow );

//Merge Transposed Header table, dtx,  with raw data table, dt
dtjoined=dt<< Join(
	With( dtx ),
	Cartesian Join
); 

//close Transposed Header table
Close(dtx, NoSave);
//close raw data table, dt
Close(dt, NoSave);

//For the first file, keep it, it will be used to append other file
//For the other files, append then close 
If (iii==1 , dtfinal = dtjoined,
	//else
	current data table(dtfinal);
	dtfinal << Concatenate( dtjoined , Append to first table(1));
	wait(0);
	close(dtjoined, NoSave)
);


//no need for Run Formulas() there are no formulas in the table	

); //end the for iii loop 

//Give the final table a name
dtfinal << set name( "Combined Data Table"); 

I tested this script using the file you displayed and a modification of that file. Here is the result using the script above.

image.png

gzmorgan0
Super User (Alumni)

Re: Transpose header data and concatenate files

Sorry @ian_jmp , you must have posted your script while i was editing the post.  

ian_jmp
Level X

Re: Transpose header data and concatenate files

No problem - One of my many failings is that I find it hard to work with someone else's code :( . . .

mlipsitz
Level I

Re: Transpose header data and concatenate files

Thank you very much ian_jmp and gzmorgan0. I agree I was struggling with some of the syntax and trying to combine code from various sources. I was able to get gzmorgan0's code to work with some slight adjustments. See the attached script. Yes, all of my files are formatted identically from an automated data acquisition system. Thanks again for your help with this and for the syntax lesson!

txnelson
Super User

Re: Transpose header data and concatenate files

@mlipsitz 

Your thank you response is not an appropriate entry to be tagged as a "Solution".  The readers need to have a pointer to the response(s) that actually contain the solution.  Please select the response(s) that show the solution(s).

Jim