cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
View Original Published Thread

Scripting to Combine Columns based on Formula

MikeDereviankin
Level IV

I have a template dataframe that contains 209 columns labelled 1-209. Often a client will forward me datasets that has combined columns, for instance, column 2 may be added with column 3 (labelled '2 + 3') in the new dataset.

 

How can I extract the columns in the new dataset and use these as formulas to make the columns the same in the template dataframe? I know JMP has a combine function, but how can I write a script that extracts all column names and converts those into formulas to align the two datasets. 

I have attached what the template and target datasets look like within this email. 

M. Dereviankin
1 ACCEPTED SOLUTION

Accepted Solutions
SDF1
Super User


Re: Scripting to Combine Columns based on Formula

Hi @MikeDereviankin ,

 

  I'm not sure that I completely understood what you were after, but I think so, and worked on it for about a day. (I'm not as quick a programmer or JSL savvy as @txnelson , he knows his stuff!). I also didn't try his code, but I came at it thinking about it differently (I think).

 

  The below code will take the column names from the "target" data table that have "+" in them, create a new formula column in the "template" data table that adds those specific columns together. It then moves them to the proper location in the data table and deletes the original columns that were added together. It also removes the formula property from the column.

 

  It took about a day to search up the right way to call the column names because they're numbers. It's a lot easier to have character names in order to reference them without the Eval() and Parse() commands, etc. I'm sure someone better can make it more elegant, but this works (at least the way I interpreted you asking for how to do it). I put a Wait() command in there, but you probably don't need that.

Names Default To Here( 1 );

//defines the two data tables, they must be open already and named appropriately
dt_target = Data Table( "Target" );
dt_template = Data Table( "Template" );

//This gets the column names for each table
target_col_names = dt_target << get columnnames( Numeric, string );
template_col_names = dt_template << get column Names( Numeric, string );

//This tests the column names in the target data table to see if there is a "+" and if so puts that column name in the target_col_keep list
target_col_keep = {};
For( i = 1, i <= N Cols( dt_target ) - 1, i++,
	If( Contains( target_col_names[i], "+" ) != 0,
		Insert Into( target_col_keep, target_col_names[i] )
	)
);

//This loops through the items in target_col_keep to make new columns and add formulas.
//The complicated part of As Column(Num(Word(Eval(EvalExpr(Expr()))))) basically converts the number names to a reference-able column.
//The extra +1 is there because of the first column in the dt_template data table -- it offsets the column numbers on down the line in //reference to the target data table.
//The first word command gets only the first column of the added up to the "+", hence Word(1,...)
//The second word command gets only the second word after the "+", hence Word(2,...)
//The new columns are added to the end of the data table.
i=.;//reset counter
For( i = 1, i <= N Items( target_col_keep ), i++,
	dt_template << New Column( target_col_keep[i],
		"Numeric",
		"continuous",
		Formula(
			As Column( (Num( Word( 1, Eval( Eval Expr( Expr( target_col_keep[i] ) ) ), "+" ) ) + 1) ) + As Column(
				(Num( Word( 2, Eval( Eval Expr( Expr( target_col_keep[i] ) ) ), "+" ) ) + 1)
			)
		)
	);
	
	//this step moves the newly created column after the second column of the sum. Comment out if you don't want to move them.
	prevcol = Num( Word( 1, Eval( Eval Expr( Expr( target_col_keep[i] ) ) ), "+" ) );
	dt_template << select Columns( As Column( Eval( Eval Expr( Expr( target_col_keep[i] ) ) ) ) ) <<
	Move Selected Columns( After( Column( Parse( ":Name(\!"" || Char( prevcol ) || "\!")" ) ) ) );
);

Wait(0.5);

//This part removes the formula. Comment out if you don't want to remove the formula.
j = .;//reset counter
For( j = 1, j <= N Col( dt_template ), j++,
	As Column( j ) << Delete property( "Formula" )
);


//This deletes the old columns that were added together. Comment out if you don't want to delete them.
k=.;//reset counter
For( k = 1, k <= N Items( target_col_keep ), k++,
	str1 = Num( Word( 1, Eval( Eval Expr( Expr( target_col_keep[k] ) ) ), "+" ) );
	str2 = Num( Word( 2, Eval( Eval Expr( Expr( target_col_keep[k] ) ) ), "+" ) );
	dt_template << Delete Columns( Column( Parse( ":Name(\!"" || Char( str1 ) || "\!")" ) ) );
	dt_template << Delete Columns( Column( Parse( ":Name(\!"" || Char( str2 ) || "\!")" ) ) );
);

Hope this helps!,

DS

View solution in original post

4 REPLIES 4
txnelson
Super User


Re: Scripting to Combine Columns based on Formula

Here is a simple script that expands on the combined columns.  Note: you have an error in the column names in the Template data table.  Column name 95 appears twice.  The second one is supposed to be 98.

Names Default To Here( 1 );

// Point to the 2 data tables....assumption is they are open
dtTemplate = Data Table( "Template_Template" );
dtTarget = Data Table( "Target_Target" );

// Create a template data table with just the column names 
dtHeader = dtTemplate << subset( selectd rows( 0 ), selected columns( 0 ) );
dtHeader << select where( Row() > 0 );
dtHeader << delete rows;

// Change the column data type to match in incoming target data
dtHeader:Column 1 << datatype( character );

// Concatenate the data together.  JMP will match all columns with the
// same name, and create new columns for ones that don't have a match
dtInterim = dtHeader << concatenate( dtTarget );

// Loop across the new columns and assign the values back to the columns
// the values need to be in
For( i = N Cols( dtHeader ) + 1, i <= N Cols( dtInterim ), i++,
	start = Num( Word( 1, Column( dtInterim, i ) << get name, "+" ) );
	end = Num( Word( 2, Column( dtInterim, i ) << get name, "+" ) );
	For( k = start, k <= end, k++,
		Column( dtInterim, Char( k ) )[1] = Column( dtInterim, i )[1]
	);
);

// Delete the extra columns
dtInterim << delete columns( index(N Cols( dtHeader ) + 1, N Cols( dtInterim )));
Jim
SDF1
Super User


Re: Scripting to Combine Columns based on Formula

Hi @MikeDereviankin ,

 

  I'm not sure that I completely understood what you were after, but I think so, and worked on it for about a day. (I'm not as quick a programmer or JSL savvy as @txnelson , he knows his stuff!). I also didn't try his code, but I came at it thinking about it differently (I think).

 

  The below code will take the column names from the "target" data table that have "+" in them, create a new formula column in the "template" data table that adds those specific columns together. It then moves them to the proper location in the data table and deletes the original columns that were added together. It also removes the formula property from the column.

 

  It took about a day to search up the right way to call the column names because they're numbers. It's a lot easier to have character names in order to reference them without the Eval() and Parse() commands, etc. I'm sure someone better can make it more elegant, but this works (at least the way I interpreted you asking for how to do it). I put a Wait() command in there, but you probably don't need that.

Names Default To Here( 1 );

//defines the two data tables, they must be open already and named appropriately
dt_target = Data Table( "Target" );
dt_template = Data Table( "Template" );

//This gets the column names for each table
target_col_names = dt_target << get columnnames( Numeric, string );
template_col_names = dt_template << get column Names( Numeric, string );

//This tests the column names in the target data table to see if there is a "+" and if so puts that column name in the target_col_keep list
target_col_keep = {};
For( i = 1, i <= N Cols( dt_target ) - 1, i++,
	If( Contains( target_col_names[i], "+" ) != 0,
		Insert Into( target_col_keep, target_col_names[i] )
	)
);

//This loops through the items in target_col_keep to make new columns and add formulas.
//The complicated part of As Column(Num(Word(Eval(EvalExpr(Expr()))))) basically converts the number names to a reference-able column.
//The extra +1 is there because of the first column in the dt_template data table -- it offsets the column numbers on down the line in //reference to the target data table.
//The first word command gets only the first column of the added up to the "+", hence Word(1,...)
//The second word command gets only the second word after the "+", hence Word(2,...)
//The new columns are added to the end of the data table.
i=.;//reset counter
For( i = 1, i <= N Items( target_col_keep ), i++,
	dt_template << New Column( target_col_keep[i],
		"Numeric",
		"continuous",
		Formula(
			As Column( (Num( Word( 1, Eval( Eval Expr( Expr( target_col_keep[i] ) ) ), "+" ) ) + 1) ) + As Column(
				(Num( Word( 2, Eval( Eval Expr( Expr( target_col_keep[i] ) ) ), "+" ) ) + 1)
			)
		)
	);
	
	//this step moves the newly created column after the second column of the sum. Comment out if you don't want to move them.
	prevcol = Num( Word( 1, Eval( Eval Expr( Expr( target_col_keep[i] ) ) ), "+" ) );
	dt_template << select Columns( As Column( Eval( Eval Expr( Expr( target_col_keep[i] ) ) ) ) ) <<
	Move Selected Columns( After( Column( Parse( ":Name(\!"" || Char( prevcol ) || "\!")" ) ) ) );
);

Wait(0.5);

//This part removes the formula. Comment out if you don't want to remove the formula.
j = .;//reset counter
For( j = 1, j <= N Col( dt_template ), j++,
	As Column( j ) << Delete property( "Formula" )
);


//This deletes the old columns that were added together. Comment out if you don't want to delete them.
k=.;//reset counter
For( k = 1, k <= N Items( target_col_keep ), k++,
	str1 = Num( Word( 1, Eval( Eval Expr( Expr( target_col_keep[k] ) ) ), "+" ) );
	str2 = Num( Word( 2, Eval( Eval Expr( Expr( target_col_keep[k] ) ) ), "+" ) );
	dt_template << Delete Columns( Column( Parse( ":Name(\!"" || Char( str1 ) || "\!")" ) ) );
	dt_template << Delete Columns( Column( Parse( ":Name(\!"" || Char( str2 ) || "\!")" ) ) );
);

Hope this helps!,

DS


Re: Scripting to Combine Columns based on Formula

Hi @SDF1,

This worked perfectly! Thanks for your help.
M. Dereviankin
SDF1
Super User


Re: Scripting to Combine Columns based on Formula

Glad it works! You're welcome.