cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
Isabel26
Level III

Ordering columns based on the frequency not names...

I know how to do it in SAS but I cannot figure this out in JMP!!! As attached dummy data, I have unique ID for each row and different variables that either 1 or 0. I want to summarize for each ID that which variable flagged as 1, AND the list has to based on the order of frequencies of each Variable (who has the most 1's down to the least 1's). It is easy to manually reorder columns then concatenate all to my results column, but I have thousands rows and hundreds columns to reorder. I have to find a way rather than manual... dying here. please help!  

1 ACCEPTED SOLUTION

Accepted Solutions
jthi
Super User

Re: Ordering columns based on the frequency not names...

This post should most likely be in Discussions not in Community Discussions area.

 

Most likely there are better ways to do this, but here is scripting heavy solution which seemed to work with example data:

Names Default To Here(1);

//Example datatable

dt = New Table("Q OF THE DAY",
	Add Rows(5),
	New Column("ID", Character(1), "Nominal", Set Values({"A", "B", "C", "D", "E"}), Set Display Width(38)),
	New Column("V1", Numeric, "Continuous", Format("Best", 12), Set Values([1, 0, 1, 0, 0]), Set Display Width(38)),
	New Column("V2", Numeric, "Continuous", Format("Best", 12), Set Values([1, 1, 1, 0, 0]), Set Display Width(38)),
	New Column("V3", Numeric, "Continuous", Format("Best", 12), Set Values([0, 0, 1, 0, 0]), Set Display Width(38)),
	New Column("V4", Numeric, "Continuous", Format("Best", 12), Set Values([1, 1, 1, 0, 1]), Set Display Width(42)),
	New Column("V6", Numeric, "Continuous", Format("Best", 12), Set Values([0, 0, 1, 0, 1]))
);

/*
colCount = 10;
dt = new table("example", 
	add rows(1000),
	New Column("ID", Numeric, Ordinal, << Set Each Value(Row()));
);

For(i = 1, i <= colCount, i++,
	dt << New Column("V" || char(i), Numeric, Ordinal,
		<< Set Each Value(Random Binomial(1, 0.9))
	);
);
*/

colNames = dt << Get Column Names("String");

//Drop ID column 
colNames = Substr(colNames, 2);

//sorting list https://community.jmp.com/t5/JSL-Cookbook/Sorting-Lists/ta-p/240795
//create ranking based on sums and column idx
sumsOfCols = V Sum(dt[0, colNames]); //order is same as in coLNames
orderCols = Reverse(Rank(sumsOfCols));

newCol = dt << New Column("ORDERED", Character, Nominal);
For Each Row(dt,
	//Indices of columns which have 1
	idxWithOne = Loc(dt[Row(), colNames]);
	//if all values are missing continue to next row
	If(N Items(idxWithOne) == 0,
		continue();
	);
	//fill list with empty values
	tempList = Repeat({""}, N Items(colNames));
	//replace empty with colnames which had 1
	tempList[idxWithOne] = colNames[idxWithOne];
	//reorder list based on sum ranking
	reOrderedList = tempList[orderCols];
	//create string of list and remove whitespaces
	orderedString = Trim Whitespace(Collapse Whitespace(Concat Items(tempList[orderCols], " ")));
	//add value to column
	newCol[Row()] = orderedString;
);

Also it has no specific handling for ties.

jthi_0-1634835308987.png

 

You might be also to  do this interactively:

Stack data:

jthi_1-1634836504734.png

Add formulas:

	New Column("Orders",
		Numeric,
		"Continuous",
		Format("Best", 12),
		Formula(Col Sum(:Data, :Label))
	),
	New Column("RankForId",
		Numeric,
		"Continuous",
		Format("Best", 12),
		Formula(Col Rank(Col Max(:Orders) - :Orders, :ID)),
		Set Display Width(70)
	),
	New Column("ColumnName",
		Character,
		"Nominal",
		Formula(If(:Data == 1, :Label))
	)

Sort data by ID and RankForId (select both and sort ascending):

jthi_2-1634836569135.png

Transpose data:

jthi_3-1634836578318.png

jthi_4-1634836595232.png

 

Remove Label column and Combine Row columns from Columns menu:

jthi_5-1634836631683.png

Remove unnecessary columns and join back to original datatable if needed.

 

Interactive solution might have different ordering between ties that the scripting version

 

 

 

-Jarmo

View solution in original post

2 REPLIES 2
jthi
Super User

Re: Ordering columns based on the frequency not names...

This post should most likely be in Discussions not in Community Discussions area.

 

Most likely there are better ways to do this, but here is scripting heavy solution which seemed to work with example data:

Names Default To Here(1);

//Example datatable

dt = New Table("Q OF THE DAY",
	Add Rows(5),
	New Column("ID", Character(1), "Nominal", Set Values({"A", "B", "C", "D", "E"}), Set Display Width(38)),
	New Column("V1", Numeric, "Continuous", Format("Best", 12), Set Values([1, 0, 1, 0, 0]), Set Display Width(38)),
	New Column("V2", Numeric, "Continuous", Format("Best", 12), Set Values([1, 1, 1, 0, 0]), Set Display Width(38)),
	New Column("V3", Numeric, "Continuous", Format("Best", 12), Set Values([0, 0, 1, 0, 0]), Set Display Width(38)),
	New Column("V4", Numeric, "Continuous", Format("Best", 12), Set Values([1, 1, 1, 0, 1]), Set Display Width(42)),
	New Column("V6", Numeric, "Continuous", Format("Best", 12), Set Values([0, 0, 1, 0, 1]))
);

/*
colCount = 10;
dt = new table("example", 
	add rows(1000),
	New Column("ID", Numeric, Ordinal, << Set Each Value(Row()));
);

For(i = 1, i <= colCount, i++,
	dt << New Column("V" || char(i), Numeric, Ordinal,
		<< Set Each Value(Random Binomial(1, 0.9))
	);
);
*/

colNames = dt << Get Column Names("String");

//Drop ID column 
colNames = Substr(colNames, 2);

//sorting list https://community.jmp.com/t5/JSL-Cookbook/Sorting-Lists/ta-p/240795
//create ranking based on sums and column idx
sumsOfCols = V Sum(dt[0, colNames]); //order is same as in coLNames
orderCols = Reverse(Rank(sumsOfCols));

newCol = dt << New Column("ORDERED", Character, Nominal);
For Each Row(dt,
	//Indices of columns which have 1
	idxWithOne = Loc(dt[Row(), colNames]);
	//if all values are missing continue to next row
	If(N Items(idxWithOne) == 0,
		continue();
	);
	//fill list with empty values
	tempList = Repeat({""}, N Items(colNames));
	//replace empty with colnames which had 1
	tempList[idxWithOne] = colNames[idxWithOne];
	//reorder list based on sum ranking
	reOrderedList = tempList[orderCols];
	//create string of list and remove whitespaces
	orderedString = Trim Whitespace(Collapse Whitespace(Concat Items(tempList[orderCols], " ")));
	//add value to column
	newCol[Row()] = orderedString;
);

Also it has no specific handling for ties.

jthi_0-1634835308987.png

 

You might be also to  do this interactively:

Stack data:

jthi_1-1634836504734.png

Add formulas:

	New Column("Orders",
		Numeric,
		"Continuous",
		Format("Best", 12),
		Formula(Col Sum(:Data, :Label))
	),
	New Column("RankForId",
		Numeric,
		"Continuous",
		Format("Best", 12),
		Formula(Col Rank(Col Max(:Orders) - :Orders, :ID)),
		Set Display Width(70)
	),
	New Column("ColumnName",
		Character,
		"Nominal",
		Formula(If(:Data == 1, :Label))
	)

Sort data by ID and RankForId (select both and sort ascending):

jthi_2-1634836569135.png

Transpose data:

jthi_3-1634836578318.png

jthi_4-1634836595232.png

 

Remove Label column and Combine Row columns from Columns menu:

jthi_5-1634836631683.png

Remove unnecessary columns and join back to original datatable if needed.

 

Interactive solution might have different ordering between ties that the scripting version

 

 

 

-Jarmo
Isabel26
Level III

Re: Ordering columns based on the frequency not names...

thanks so much! this is great!