cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
aliegner1
Level IV

Script to loop thru and group columns by name, dynamically

So I'm trying to figure this out and its just not quite working.

 

I've got a table with a key-value pair that I've SPLIT from Tall to Wide. I end up with about 10k columns that all have a regular naming structure, like 
"15.02.0 param1", "15.02 param2", "15.02 param3"...."15.03 param1", "15.03 param2", "15.03 param3",...

 

How can I look through all these columns and group them dynamically by the starting term? ie. all 15.02% columns in a group, all 15.03% cols, all 15.04%, ... all 17.06%....and name the group based on the groupby number?

 

My script just groups everything into a single group. ug.

 

//**split Aggregate Data from TALL to WIDE, grouping WIDE columns by StepID
dt = current data table();
Summarize( dt, RecipeStepIndex = by( :RecipeStepId) );//get list of unique RecipeStepID's

//split the table TALL to WIDE
dt << Split(
       Split By( :RecipeStepId, :ParameterName, :AggregateFunction ),
       Split( :Value ),
       Group( :Recipe, :ProcessTime, :Tool, :Chamber, :Lot, :WaferScribe ),
       Output Table( "AggWide" ),
       Remaining Columns( Drop All ),
       Sort by Column Property,
       dt << sort( by( :Recipe, :Tool, :Chamber, :Lot, :WaferScribe ))
); dtAggWide = current data table(); ColNamesList = dtAggWide << get column names(string, continuous); //get the column names of the WIDE table //Create an empty list for the matching columns RecipeStepList = {}; // Find matching column names For( i = 1, i <= N items( RecipeStepIndex ), i++, For( j = 1, j <= N Items( ColNamesList ), j++, If( Contains( ColNamesList[j], RecipeStepIndex[i] ), Insert Into( RecipeStepList, ColNamesList[j] ), ); ); ); // If groups of columns are found, create Col groupings If( N Items( RecipeStepList ) > 0, dtAggWide << select columns( RecipeStepList ); dtAggWide << group columns(); );
2 ACCEPTED SOLUTIONS

Accepted Solutions
jthi
Super User

Re: Script to loop thru and group columns by name, dynamically

I think that error means that you don't have a key in associative array. If you tried the second code, does your code have this line:

 

aa_groups = Associative Array(RecipeStepIndex, Repeat({{}}, N Items(RecipeStepIndex)));

it is used to initialize aa_groups. Also if the RecipeStepIndex do not match with col_group = Word(1, col_name, " "); you will get that error. You might have to add some extra handling for cases like that (add group which will collect all the missing columns or only add columns which are found).

 

 

This should skip columns which do not have matching group.

 

For Each({col_name}, col_list, 
	col_group = Word(1, col_name, " ");
	If(!Contains(aa_groups, col_group),
		aa_groups[col_group] = Eval List({col_name});
	,
		Insert Into(aa_groups[col_group], col_name);
	);
);

 

 

-Jarmo

View solution in original post

aliegner1
Level IV

Re: Script to loop thru and group columns by name, dynamically

gotchya. I figured out how to get both versions to work! Big thanks and kudos.

 

For anyone in the future, here's my finished versions:

 

V1=

//**split Aggregate Data from TALL to WIDE, grouping WIDE columns by StepID
dt = current data table();

//Build and prefill an Array of names of the RecipeStepId
Summarize( dt, RecipeStepIndex = by( :RecipeStepId) );//get list of unique RecipeStepID's
aa_groups = Associative Array(RecipeStepIndex, Repeat({{}}, N Items(RecipeStepIndex)));

//split the table TALL to WIDE
dt << Split(
       Split By( :RecipeStepId, :ParameterName, :AggregateFunction ),
       Split( :Value ),
       Group( :Recipe, :ProcessTime, :Tool, :Chamber, :Lot, :WaferScribe ),
       Output Table( "AggWide" ),
       Remaining Columns( Drop All ),
       Sort by Column Property,
       dt << sort( by( :Recipe, :Tool, :Chamber, :Lot, :WaferScribe ))
);

//Group the Cols by name of RecipeStepId
col_list = dtAggWIDE << Get Column Names("String"); //get the col list
For Each({col_name}, col_list, 
	col_group = Word(1, col_name, " ");
	If(!Contains(aa_groups, col_group),
		aa_groups[col_group] = Eval List({col_name});
	,
		Insert Into(aa_groups[col_group], col_name);
	);
);

For Each({{key, value}}, aa_groups,					//perform the grouping
	dtAggWide << Group Columns(key, value); 
);

 

V2=

//**split Aggregate Data from TALL to WIDE, grouping WIDE columns by StepID
dt = current data table();

//Build and prefill an Array of names of the RecipeStepId
Summarize( dt, RecipeStepIndex = by( :RecipeStepId) );//get list of unique RecipeStepID's
aa_groups = Associative Array();

//split the table TALL to WIDE
dt << Split(
       Split By( :RecipeStepId, :ParameterName, :AggregateFunction ),
       Split( :Value ),
       Group( :Recipe, :ProcessTime, :Tool, :Chamber, :Lot, :WaferScribe ),
       Output Table( "AggWide" ),
       Remaining Columns( Drop All ),
       Sort by Column Property,
       dt << sort( by( :Recipe, :Tool, :Chamber, :Lot, :WaferScribe ))
);

//Group the Cols by name of RecipeStepId
col_list = dtAggWide << Get Column Names("String");
For Each({col_name}, col_list, 
	col_group = Word(1, col_name, " ");
	If(!Contains(aa_groups, col_group),
		aa_groups[col_group] = {};
	);
	Insert Into(aa_groups[col_group], col_name)
);

For Each({{key, value}}, aa_groups,					//perform the grouping
	dtAggWide << Group Columns(key, value); 
);

View solution in original post

11 REPLIES 11
jthi
Super User

Re: Script to loop thru and group columns by name, dynamically

There are many ways to do this.

 

Here is one example which first builds the groups into associative array with the key as name of the group and value is list of columns in the group, then loops over that associative array to group columns in the data table. To get group names we use Word() function on the column names. This requires JMP16 due to using For Each

 

Names Default To Here(1);
// this solution requires JMP16

dtAggWide = New Table("Untitled",
	Add Rows(0),
	New Column("15.02.0 param1", Numeric, "Continuous", Format("Best", 12), Set Values([])),
	New Column("15.02.0 param2", Numeric, "Continuous", Format("Best", 12), Set Values([])),
	New Column("15.02.0 param3", Numeric, "Continuous", Format("Best", 12), Set Values([])),
	New Column("15.03 param1", Numeric, "Continuous", Format("Best", 12), Set Values([])),
	New Column("15.03 param2", Numeric, "Continuous", Format("Best", 12), Set Values([]))
);

col_list = dtAggWide << Get Column Names("String");

// first get the column group names and columns
// in this solution case we use associative array aa_groups = Associative Array(); For Each({col_name}, col_list, col_group = Word(1, col_name, " "); If(!Contains(aa_groups, col_group), aa_groups[col_group] = {}; ); Insert Into(aa_groups[col_group], col_name) ); Show(aa_groups); For Each({{key, value}}, aa_groups, dtAggWide << Group Columns(key, value); );

jthi_0-1647442521941.png

 

-Jarmo
aliegner1
Level IV

Re: Script to loop thru and group columns by name, dynamically

thankyou, will try this shortly.

But what about building the grouping array list dynamically? Basically, each unique value of :RecipeStepId should be a groupby name. I tried doing that at the beginning, before splitting the table, saving it to the "RecipeStepIndex"

Summarize( dt, RecipeStepIndex = by( :RecipeStepId) );//get list of unique RecipeStepID's

 

 

jthi
Super User

Re: Script to loop thru and group columns by name, dynamically

You could "pre-fill" the Associative array for example with the column names based on RecipeStepIndex. Most likely RecipeStepIndex will be found from the start of split column names so this should still work.

 

 

Names Default To Here(1);
// this solution requires JMP16

dtAggWide = New Table("Untitled",
	Add Rows(0),
	New Column("15.02.0 param1", Numeric, "Continuous", Format("Best", 12), Set Values([])),
	New Column("15.02.0 param2", Numeric, "Continuous", Format("Best", 12), Set Values([])),
	New Column("15.02.0 param3", Numeric, "Continuous", Format("Best", 12), Set Values([])),
	New Column("15.03 param1", Numeric, "Continuous", Format("Best", 12), Set Values([])),
	New Column("15.03 param2", Numeric, "Continuous", Format("Best", 12), Set Values([]))
);

col_list = dtAggWide << Get Column Names("String");

// first get the column group names and columns// in this solution case we use associative array
aa_groups = Associative Array();

//Summarize(dt, RecipeStepIndex = by(:RecipeStepId));//get list of unique RecipeStepID's
RecipeStepIndex = {"15.02.0", "15.03"};
aa_groups = Associative Array(RecipeStepIndex, Repeat({{}}, N Items(RecipeStepIndex)));

For Each({col_name}, col_list, 
	col_group = Word(1, col_name, " ");
	Insert Into(aa_groups[col_group], col_name)
);

For Each({{key, value}}, aa_groups,
	dtAggWide << Group Columns(key, value); 
);

 

 

-Jarmo
aliegner1
Level IV

Re: Script to loop thru and group columns by name, dynamically

Hmm. I'm a bit stuck and not sure what you're doing here.

Why still have dtAggWide and col_list?


//*****************************************************************
//**split Aggregate Data from TALL to WIDE, grouping WIDE columns by StepID
dt = current data table();

//Build and prefill an Array of names of the RecipeStepId
aa_groups = Associative Array();
Summarize( dt, RecipeStepIndex = by( :RecipeStepId) );//get list of unique RecipeStepID's
aa_groups = Associative Array(RecipeStepIndex, Repeat({{}}, N Items(RecipeStepIndex)));

//split the table TALL to WIDE
dt << Split(
       Split By( :RecipeStepId, :RecipeStepName, :ParameterName, :AggregateFunction ),
       Split( :Value ),
       Group( :Recipe, :ProcessTime, :Tool, :Chamber, :Lot, :WaferScribe ),
       Output Table( "AggWide" ),
       Remaining Columns( Drop All ),
       Sort by Column Property,
       dt << sort( by( :Recipe, :Tool, :Chamber, :Lot, :WaferScribe ))
);

//Group the Cols by name of RecipeStepId
For Each({col_name}, col_list, 
	col_group = Word(1, col_name, " ");
	Insert Into(aa_groups[col_group], col_name)
);
For Each({{key, value}}, aa_groups,
	dtAggWide << Group Columns(key, value); 
);
jthi
Super User

Re: Script to loop thru and group columns by name, dynamically

dtAggWide is there because it is the data table which has the columns you want to group. In the code above you are missing reference to it, easiest and best way to add it would be (this will avoid using Current Data Table() which might or might not work):

dtAggWide = dt << Split(
       Split By( :RecipeStepId, :RecipeStepName, :ParameterName, :AggregateFunction ),
       Split( :Value ),
       Group( :Recipe, :ProcessTime, :Tool, :Chamber, :Lot, :WaferScribe ),
       Output Table( "AggWide" ),
       Remaining Columns( Drop All ),
       Sort by Column Property,
       dt << sort( by( :Recipe, :Tool, :Chamber, :Lot, :WaferScribe ))
);

And col_list is there to loop over the columns in dtAggWide and then add the columns to correct lists in the aa_groups associative array

-Jarmo
aliegner1
Level IV

Re: Script to loop thru and group columns by name, dynamically

OK, I think I see what you mean, I was confused why you were creating a new dt, but looks like just for the example. 

 

your second set of code doesn't seem to work for me, but the first set did. Any idea what this error means?

aliegner1_0-1647450644509.png

 

//Group the Cols by name of RecipeStepId
col_list = dtAggWIDE << Get Column Names("String");

For Each({col_name}, col_list, 
	col_group = Word(1, col_name, " ");
	If(!Contains(aa_groups, col_group),
		aa_groups[col_group] = {};
	);
	Insert Into(aa_groups[col_group], col_name)
);

Show(aa_groups);

For Each({{key, value}}, aa_groups,
	dtAggWide << Group Columns(key, value); 
);

aliegner1_1-1647450778727.png

 

jthi
Super User

Re: Script to loop thru and group columns by name, dynamically

I think that error means that you don't have a key in associative array. If you tried the second code, does your code have this line:

 

aa_groups = Associative Array(RecipeStepIndex, Repeat({{}}, N Items(RecipeStepIndex)));

it is used to initialize aa_groups. Also if the RecipeStepIndex do not match with col_group = Word(1, col_name, " "); you will get that error. You might have to add some extra handling for cases like that (add group which will collect all the missing columns or only add columns which are found).

 

 

This should skip columns which do not have matching group.

 

For Each({col_name}, col_list, 
	col_group = Word(1, col_name, " ");
	If(!Contains(aa_groups, col_group),
		aa_groups[col_group] = Eval List({col_name});
	,
		Insert Into(aa_groups[col_group], col_name);
	);
);

 

 

-Jarmo
aliegner1
Level IV

Re: Script to loop thru and group columns by name, dynamically

gotchya. I figured out how to get both versions to work! Big thanks and kudos.

 

For anyone in the future, here's my finished versions:

 

V1=

//**split Aggregate Data from TALL to WIDE, grouping WIDE columns by StepID
dt = current data table();

//Build and prefill an Array of names of the RecipeStepId
Summarize( dt, RecipeStepIndex = by( :RecipeStepId) );//get list of unique RecipeStepID's
aa_groups = Associative Array(RecipeStepIndex, Repeat({{}}, N Items(RecipeStepIndex)));

//split the table TALL to WIDE
dt << Split(
       Split By( :RecipeStepId, :ParameterName, :AggregateFunction ),
       Split( :Value ),
       Group( :Recipe, :ProcessTime, :Tool, :Chamber, :Lot, :WaferScribe ),
       Output Table( "AggWide" ),
       Remaining Columns( Drop All ),
       Sort by Column Property,
       dt << sort( by( :Recipe, :Tool, :Chamber, :Lot, :WaferScribe ))
);

//Group the Cols by name of RecipeStepId
col_list = dtAggWIDE << Get Column Names("String"); //get the col list
For Each({col_name}, col_list, 
	col_group = Word(1, col_name, " ");
	If(!Contains(aa_groups, col_group),
		aa_groups[col_group] = Eval List({col_name});
	,
		Insert Into(aa_groups[col_group], col_name);
	);
);

For Each({{key, value}}, aa_groups,					//perform the grouping
	dtAggWide << Group Columns(key, value); 
);

 

V2=

//**split Aggregate Data from TALL to WIDE, grouping WIDE columns by StepID
dt = current data table();

//Build and prefill an Array of names of the RecipeStepId
Summarize( dt, RecipeStepIndex = by( :RecipeStepId) );//get list of unique RecipeStepID's
aa_groups = Associative Array();

//split the table TALL to WIDE
dt << Split(
       Split By( :RecipeStepId, :ParameterName, :AggregateFunction ),
       Split( :Value ),
       Group( :Recipe, :ProcessTime, :Tool, :Chamber, :Lot, :WaferScribe ),
       Output Table( "AggWide" ),
       Remaining Columns( Drop All ),
       Sort by Column Property,
       dt << sort( by( :Recipe, :Tool, :Chamber, :Lot, :WaferScribe ))
);

//Group the Cols by name of RecipeStepId
col_list = dtAggWide << Get Column Names("String");
For Each({col_name}, col_list, 
	col_group = Word(1, col_name, " ");
	If(!Contains(aa_groups, col_group),
		aa_groups[col_group] = {};
	);
	Insert Into(aa_groups[col_group], col_name)
);

For Each({{key, value}}, aa_groups,					//perform the grouping
	dtAggWide << Group Columns(key, value); 
);
SpannerHead
Level IV

Re: Script to loop thru and group columns by name, dynamically

Nice script!  I tweaked it a bit to group all data columns with the matching first 4 characters in the column title.  The data I get is named in that way and this makes it much easier to look at like data kinds.

 

dt = Current Data Table();

col_list = dt << Get Column Names("String");

// first get the column group names and columns// in this solution case we use associative array
aa_groups = Associative Array();
For Each({col_name}, col_list, 
	col_group = Substr( col_name, 1, 4 );
	If(!Contains(aa_groups, col_group),
		aa_groups[col_group] = {};
	);
	Insert Into(aa_groups[col_group], col_name)
);
Show(aa_groups);

For Each({{key, value}}, aa_groups,
	dt << Group Columns(key, value); 
);

 


Slán



SpannerHead