cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
danielrbiber
Level III

Meta Data in Column Name to Row

Greetings,

 

I am in need of help scripting a multi-step process as I am unable to abstract a viable solution to achieve some data preparation work.

I have time series data that comes in as several spread sheets, containing several tabs, and each tab has data with column names that contain meta data contextualizing the data.

In the example below, the columns with "Descriptor" (columns 1,3 and 5) should really be named "Date", but instead contain Descriptors that are needed for analysis.  Additionally, The Unit columns(2,4 and 6) should really be named data but contain the unit of measure.

Big:Small:WetmgBad:Good:UglyHzPort:Starboard:SpeedmL
12/15/2021012/15/2021412/15/20218
12/16/2021112/16/2021512/16/20219
12/17/2021212/17/2021612/17/202110
12/18/2021312/18/2021712/18/202111

 

Ideally, the finished data table would have the following structure.

DateDescriptorType1DescriptorType2DescriptorType3UnitData
12/15/2021BigSmallWetmg0
12/16/2021BigSmallWetmg1
12/17/2021BigSmallWetmg2
12/18/2021BigSmallWetmg3
12/15/2021BadGoodUglyHz4
12/16/2021BadGoodUglyHz5
12/17/2021BadGoodUglyHz6
12/18/2021BadGoodUglyHz7
12/15/2021PortStarboardSpeedmL8
12/16/2021PortStarboardSpeedmL9

 

My attempts to pull the column headers into a separate table and re-join with nested for loops have not proven successful.  Here is code that pulls and prepares the column name data for the example data included.

Names Default to Here(1);

//Get MainDT column Names
colNames = Data Table("Data") << Get Column Names;

//Create Empty Table
dtCols = New Table("DTCols");

//Put in Columns into the table
dtCols << New Column( "Columns", Character, Values(colNames));

//Label every other row for splitting
dtCols << New Column( "Data", formula( sequence(1,2,1)));

//Remove formula from Data column
dtCols:Data << Delete Formula;

//New column from which to split from dtCols << New Column( "Type", formula( If( contains(:Columns, "Desc"), "Meta", "Unit_Raw" ); ); ); //Deleting formula for handling dtCols:Type << Delete formula; //Splitting the row based on the labelling splitdtCols = dtCols << Split( Split By( :Type ), Split( :Columns ), Sort by Column Property, Output Table( "SplitDTCols" ) ); //Text to Columns by ":" splitdtCols << text to columns(delimiter(":"), columns(:Meta));

The nested for loop (which only looks to put the full description back into the original table) is not only relatively slow (this is needed to loop through >1 million entries) but it only writes missing data into the columns.

//Meta column back into Data
For(i = 1, i <= N Items(colNames), i++,
	For(j = 1, j <= N Rows(splitdtCols), j++,
		//show(splitdtCols:Meta[j]);
		If( eval(char(colNames[i])) == eval(char(splitdtCols:Meta[j])),
			Data Table("Data") << New Column("Descriptor", set Values(eval(char(splitdtCols:Meta[j]))))
		)
	)
);

I am willing to abandon the approach of pulling out the column name data into a separate table.  It really feels like there is a simpler approach I am missing.

1 ACCEPTED SOLUTION

Accepted Solutions
Jeff_Perkinson
Community Manager Community Manager

Re: Meta Data in Column Name to Row

Interactively this is easy to do with a Multiple series stack.

 

Use Tables->Stack and add all the columns to the Stack Columns list. Then click the checkbox for Multiple series stack. Set Number of Series to 2 and don't check Contiguous.

2021-12-15_14-36-56.828.png

You'll get a table like this:

2021-12-15_14-37-38.264.png

Rename the Data column to Date and the Label 2 column to Units.

2021-12-15_14-39-17.998.png

Then, you can use Cols->Utilities->Text to Cols... to split the Label column into three columns using a colon (":") as the delimiter.

2021-12-15_14-41-02.162.png

You'll get this:

2021-12-15_14-41-36.106.png

 

If you need to script this, because this is something you need to automate or you do this a lot, you can get the script for the Stack from Source script:

stacked_dt = Data Table( "445470" ) << Stack(
	columns(
		:"Big:Small:Wet"n, :mg, :"Bad:Good:Ugly"n, :Hz, :"Port:Starboard:Speed"n,
		:mL
	),
	Source Label Column( "Label" ),
	Stacked Data Column( "Data" ),
	Number of Series( 2 )
);

You'll need to use the <<Set Name message to change the names of the columns.

stacked_dt:"Data" << Set Name( "Date" );
stacked_dt:"Label 2" << Set Name( "Units" );

Finally Text to Columns is scriptable as well.

stacked_dt << Text To Columns(
	delimiter( ":" ),
	columns( :Label )
);

 

-Jeff

View solution in original post

2 REPLIES 2
Jeff_Perkinson
Community Manager Community Manager

Re: Meta Data in Column Name to Row

Interactively this is easy to do with a Multiple series stack.

 

Use Tables->Stack and add all the columns to the Stack Columns list. Then click the checkbox for Multiple series stack. Set Number of Series to 2 and don't check Contiguous.

2021-12-15_14-36-56.828.png

You'll get a table like this:

2021-12-15_14-37-38.264.png

Rename the Data column to Date and the Label 2 column to Units.

2021-12-15_14-39-17.998.png

Then, you can use Cols->Utilities->Text to Cols... to split the Label column into three columns using a colon (":") as the delimiter.

2021-12-15_14-41-02.162.png

You'll get this:

2021-12-15_14-41-36.106.png

 

If you need to script this, because this is something you need to automate or you do this a lot, you can get the script for the Stack from Source script:

stacked_dt = Data Table( "445470" ) << Stack(
	columns(
		:"Big:Small:Wet"n, :mg, :"Bad:Good:Ugly"n, :Hz, :"Port:Starboard:Speed"n,
		:mL
	),
	Source Label Column( "Label" ),
	Stacked Data Column( "Data" ),
	Number of Series( 2 )
);

You'll need to use the <<Set Name message to change the names of the columns.

stacked_dt:"Data" << Set Name( "Date" );
stacked_dt:"Label 2" << Set Name( "Units" );

Finally Text to Columns is scriptable as well.

stacked_dt << Text To Columns(
	delimiter( ":" ),
	columns( :Label )
);

 

-Jeff
danielrbiber
Level III

Re: Meta Data in Column Name to Row

This was exactly what I was hoping for.  Something simple and that I overlooked.  I was able to implement this pretty quickly.