cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Register for our Discovery Summit 2024 conference, Oct. 21-24, where you’ll learn, connect, and be inspired.
Choose Language Hide Translation Bar
doraemengs
Level III

getting column name with specific cut-off values

Hi,

I try to count and get specific column names from these data examples (my real data contained 180 columns and 20000 rows)

I want to select the column that cut-off > 1000 

 

No.A*1:01A*2:01B*9:02C*10:01DR*56:01DP*07_GGet name
110.6709520.2315000.190.5640.89B*9:02, C*10:01
2000009.56 
39040.3345.3345.9999.934510.5A*1:01
434.4456645700A*2:01
59190.2135673456678990880A*1:01,A*2:01,B*9:02,C*10:01,DR*56:01
6000000 
...       

I try the previous script like this 

Names Default To Here( 1 );

// Create a sample data table
dt = New Table( "Example",
	Add Rows( 2 ),
	New Column( "A*1:01", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [0, 50000] ) ),
	New Column( "A*2:01", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [0, 50000] ) ),
	New Column( "B*9:02", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [0, 50000] ) ),
	New Column( "C*10:01", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [0, 50000] ) ),
	New Column( "DR*56:01", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [0, 50000] ) ),
//get wanted columns. Assumes that IMC columns start from column 1 and all IMC columns are one after another
colNames = dt << Get Column Names("String");
For(i = N Items(colNames), i > 3000 , i--,
);

//new column with IMC with value 1
dt << New Column( "Ag", Character, 
	<< Set Each Value(
		rowValues = dt[Row(), 0][1::N Items(colNames)]; //get values of row as matrix
		colIndex = Contains(rowValues, 1); //get index of
		If(colIndex > 0, //handle 
			colNames[colIndex], //get column name with the index
			""; //when no 1 found use ""
		);
	);
);

Could you please to correct my scripts, Thank you!

In addition, can I use matrix because of I need to generate the "new column" of 180 column. 

 

Best, 

15 REPLIES 15
doraemengs
Level III

Re: getting column name with specific cut-off values

Screenshot 2024-08-13 at 2.11.43 PM.png

 Thank you so much, I would like to clarify this. I have another column of interest as "alphabet of interest" and I want to get name and value of all columns with values> 1,000 and all columns match with the column alphabet of interest.

 

Thank you!

 

 

jthi
Super User

Re: getting column name with specific cut-off values

Most likely doing something like Jim suggested is faster and easier, but you can do something like this with the method I did demonstrate

Names Default To Here(1);

dt = Open("$DOWNLOADS/Examples_get column name_by cutoff.jmp", invisible);

Summarize(dt, alphs = By(:"Alphabet of interest"n));
aa = Associative Array(alphs, Repeat({{}}, N Items(alphs)));

collist = dt << Get Column Names("String");
For Each({colname}, collist,
	alph = Word(1, colname, "*");
	If(Contains(aa, alph),
		Insert Into(aa[alph], colname);
		Column(dt, colname) << Set Data Type("Numeric") << Modeling Type("Continuous");
	);
);


str_template = "^col^ (^val^)";
newcol = dt << New Column("getname", Character, Nominal, Formula(
	cols = aa[:"Alphabet of interest"n];
	vals = dt[Row(), aa[:"Alphabet of interest"n]];
	valid_idx = Loc(vals > 1000);
	
	If(N Items(valid_idx) > 0,
		res = {};
		For Each({{col, val}}, Across(cols[valid_idx], vals[valid_idx]),
			Insert Into(res, Eval Insert(str_template));
		);
		Concat Items(res, ", ");
	,
		""
	);
));

dt << run formulas;
newcol << delete formula;

dt << show window(1);

jthi_0-1723574578991.png

-Jarmo
doraemengs
Level III

Re: getting column name with specific cut-off values

Thank you so much. This script does work. However, I am sorry that my original data was complicated more than this. The Alphabet of interest has multiple "alphabet" and "numbers" separated with "," (A1, B9) as attached files. How can I get name and value column that correlated alphabet of interest column and column name. 

Screenshot 2024-08-13 at 4.12.46 PM.png

Thank you for your help.

 

txnelson
Super User

Re: getting column name with specific cut-off values

Here is an alternative method that may help with a large number of columns

Names Default To Here( 1 );

dt = Open("$DOWNLOADS/Examples_get column name_by cutoff.jmp");

colnames = Remove( dt << Get Column Names( "String" ), 1 ); // drop No column

// Fix data and modeling types
For Each({colname}, colnames,
	Column(dt, colname) << Data Type("Numeric") << Modeling Type("Continuous");
);

dtStack = dt << Stack(
	columns( colnames ),
	Source Label Column( "ColName" ),
	Stacked Data Column( "Value" ),
	Output Table( "The Stack" )
);

dtStack << select where( :value <= 1000 );
dtStack << delete rows;
dtStack << delete columns( Value );

dtSplit = dtStack << Split(
	Split By( :ColName ),
	Split( :ColName ),
	Group( :No ),
	Output Table( "Split of The Stack by ColName" ),
	Sort by Column Property
);

close( dtStack, nosave );
colnames = Remove( dtSplit << Get Column Names( "String" ), 1 );
dtSplit << Combine Columns(
	delimiter( "," ),
	columns( colnames ),
	Column Name( "Get Name" )
);
dtSplit << delete columns( colnames );
dt << Update( With( dtSplit ), Match Columns( :No = :No ) );

Close( dtSplit, nosave );
Jim
doraemengs
Level III

Re: getting column name with specific cut-off values

Thank you, Jim, but it does not work. please see the attached file. 

txnelson
Super User

Re: getting column name with specific cut-off values

If the script was written to run with a column called "No" followed by numerical columns.  Your supplied data table has 2 character columns in addition to the numeric.  If you remove the columns "Alphabet of interest" and "get name and value" and then run the script, it works fine.

You should be able to easily change the script to compensate for the additional columns.  Either deleting them in the script before beginning the getting the colnames retrieved, or else deleting them from the colnames list before processing the For Each should work. 

Jim