Group Columns by a list of values from another column?

aliegner1 · Jun 9, 2023 11:11 AM

I'm working to group several thousand columns together by name that matches with something from a list of values from an initial column

example:

Col1 values = {step1, step2, step3}

Col2:5000 names :step1_blahblahblah, step2_blahblbahblah, etc

I guess I'm not quite sure how to define the if(contain) statement uses the list of values from Col1.

dt = current data table();
Summarize( dt, StepIDIndex= by( :RecipeStepId) );//get list of unique RecipeStepID's
colNamesList = dt << get column names(string, continuous);  //get all the column names
StepList = {}; //Create an empty list for the matching columns

// Find matching column names
For( i = 1, i <= N Items( colNamesList ), i++,
	If( 
		contains( colNamesList[i], StepIDIndex),
		Insert Into( StepList , colNamesList[i] )
	)
);

// If groups of columns are found, create col groupings
If( N Items( StepList ) > 0,
	dt<< select columns( StepList );
	dt<< group columns();
);

jthi · Feb 18, 2022 05:39 PM

If your column names(?) have always step1_something format, you could use Word to separate the step from rest of the string to make comparison easier. Below is an example of what I think you would like to do (requires JMP16 due to For Each):

Names Default To Here(1);

dt = New Table("Untitled",
	Add Rows(3),
	Set Header Height(44),
	New Column("Col1", Character, "Nominal", Set Values({"step1", "step2", "step3"})),
	New Column("step1_a", Numeric, "Continuous", Format("Best", 12), Set Values([., ., .])),
	New Column("step1_b", Numeric, "Continuous", Format("Best", 12), Set Values([., ., .])),
	New Column("step1_c", Numeric, "Continuous", Format("Best", 12), Set Values([., ., .])),
	New Column("step2_a", Numeric, "Continuous", Format("Best", 12), Set Values([., ., .])),
	New Column("step2_b", Numeric, "Continuous", Format("Best", 12), Set Values([., ., .]))
);

Summarize(dt, uniq_vals = by(:col1)); 
col_list = dt << get column names(string, continuous);

//initialize col group collector associative array
col_groups = Repeat({{}}, N Items(uniq_vals));

For Each({col_name}, col_list,
	found_idx = Contains(uniq_vals, Word(1, col_name, "_"));
	If(found_idx,
		Insert Into(col_groups[found_idx], col_name);
	);
);

For Each({col_group, idx}, uniq_vals,
	If(N Items(col_groups[idx]) > 0,
		Eval(EvalExpr(dt << Group Columns(col_group, Expr(col_groups[idx]))));
	);
);

-Jarmo

Group Columns by a list of values from another column?

Re: Group Columns by a list of values from another column?