cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
The Discovery Summit 2025 Call for Content is open! Submit an abstract today to present at our premier analytics conference.
Choose Language Hide Translation Bar
doraemengs
Level III

getting column name with specific cut-off values

Hi,

I try to count and get specific column names from these data examples (my real data contained 180 columns and 20000 rows)

I want to select the column that cut-off > 1000 

 

No.A*1:01A*2:01B*9:02C*10:01DR*56:01DP*07_GGet name
110.6709520.2315000.190.5640.89B*9:02, C*10:01
2000009.56 
39040.3345.3345.9999.934510.5A*1:01
434.4456645700A*2:01
59190.2135673456678990880A*1:01,A*2:01,B*9:02,C*10:01,DR*56:01
6000000 
...       

I try the previous script like this 

Names Default To Here( 1 );

// Create a sample data table
dt = New Table( "Example",
	Add Rows( 2 ),
	New Column( "A*1:01", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [0, 50000] ) ),
	New Column( "A*2:01", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [0, 50000] ) ),
	New Column( "B*9:02", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [0, 50000] ) ),
	New Column( "C*10:01", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [0, 50000] ) ),
	New Column( "DR*56:01", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [0, 50000] ) ),
//get wanted columns. Assumes that IMC columns start from column 1 and all IMC columns are one after another
colNames = dt << Get Column Names("String");
For(i = N Items(colNames), i > 3000 , i--,
);

//new column with IMC with value 1
dt << New Column( "Ag", Character, 
	<< Set Each Value(
		rowValues = dt[Row(), 0][1::N Items(colNames)]; //get values of row as matrix
		colIndex = Contains(rowValues, 1); //get index of
		If(colIndex > 0, //handle 
			colNames[colIndex], //get column name with the index
			""; //when no 1 found use ""
		);
	);
);

Could you please to correct my scripts, Thank you!

In addition, can I use matrix because of I need to generate the "new column" of 180 column. 

 

Best, 

3 ACCEPTED SOLUTIONS

Accepted Solutions
jthi
Super User

Re: getting column name with specific cut-off values

You will have to open your table, fix the columns and then run the part where new column is created

Names Default To Here(1);

dt = Open("$DOWNLOADS/Examples_get column name_by cutoff.jmp", invisible);

colnames = Remove(dt << Get Column Names("String"), 1); // drop No column

// Fix data and modeling types
For Each({colname}, colnames,
	Column(dt, colname) << Data Type("Numeric") << Modeling Type("Continuous");
);


newcol = dt << New Column("getname", Character, Nominal, Formula(
	valid_idx = Loc(dt[Row(), colnames] > 1000);
	If(N Items(valid_idx) > 0,
		Concat Items(colnames[valid_idx], ", ")
	,
		""
	);
));
dt << run formulas;
newcol << delete formula;

I strongly suggest you check out Scripting Guide from JMP Help

-Jarmo

View solution in original post

jthi
Super User

Re: getting column name with specific cut-off values

Most likely doing something like Jim suggested is faster and easier, but you can do something like this with the method I did demonstrate

Names Default To Here(1);

dt = Open("$DOWNLOADS/Examples_get column name_by cutoff.jmp", invisible);

Summarize(dt, alphs = By(:"Alphabet of interest"n));
aa = Associative Array(alphs, Repeat({{}}, N Items(alphs)));

collist = dt << Get Column Names("String");
For Each({colname}, collist,
	alph = Word(1, colname, "*");
	If(Contains(aa, alph),
		Insert Into(aa[alph], colname);
		Column(dt, colname) << Set Data Type("Numeric") << Modeling Type("Continuous");
	);
);


str_template = "^col^ (^val^)";
newcol = dt << New Column("getname", Character, Nominal, Formula(
	cols = aa[:"Alphabet of interest"n];
	vals = dt[Row(), aa[:"Alphabet of interest"n]];
	valid_idx = Loc(vals > 1000);
	
	If(N Items(valid_idx) > 0,
		res = {};
		For Each({{col, val}}, Across(cols[valid_idx], vals[valid_idx]),
			Insert Into(res, Eval Insert(str_template));
		);
		Concat Items(res, ", ");
	,
		""
	);
));

dt << run formulas;
newcol << delete formula;

dt << show window(1);

jthi_0-1723574578991.png

-Jarmo

View solution in original post

jthi
Super User

Re: getting column name with specific cut-off values

Here is some sort of a script, I'm still left with guess work for the correct rules so no idea if it does what it is supposed to do

Names Default To Here(1);

dt = Open("$DOWNLOADS/Examples_get column name_edit.jmp");

// fix data
For Each({colname}, dt << Get Column Names("Character", "String"),
	If(colname == "Alphabet",
		break();
	);
	Column(dt, colname) << Set Data Type("Numeric") << Set Modeling Type("Continuous");
);
dt << Delete Columns("get name and value");
dt << New Column("R", Numeric, Ordinal, Formula(Row()));

dt_stacked = dt << Stack(
	columns(
		:"DOA1*01:01,DOB1*01:01_DO1"n, :"DOA1*02:01,DOB1*01:01_DO1"n,
		:"DOA1*03:01,DOB1*01:01_DO1"n, :"DOA1*02:01,DOB1*02:01_DO2"n,
		:"DOA1*01:05,DOB1*02:01_DO2"n, :"DOA1*04:03,DOB1*03:01_DO3"n,
		:"DOA1*06:01,DOB1*04:01_DO4"n, :"DOA1*05:01,DOB1*03:01_DO7"n
	),
	Source Label Column("Label"),
	Stacked Data Column("Data"),
	Output Table("stacked")
);

rows_to_delete = dt_stacked << Get Rows Where(:Data <= 1000);
dt_stacked << Delete Rows(rows_to_delete);
dt_stacked << New Column("Match", Numeric, Nominal);

For Each Row(dt_stacked,
	alphs = Words(:Alphabet, ",");
	temp_label = Regex(:Label, "(\d\*)", "", GLOBALREPLACE);
	retval = 0;
	For Each({alph}, alphs,
		If(Contains(temp_label, alph),
			retval = 1;
			break();
		);
	);
	:Match = retval;
);

rows_to_delete = dt_stacked << Get Rows Where(:Match == 0);
dt_stacked << Delete Rows(rows_to_delete);

dt_stacked << New Column("NewStr", Character, Nominal, Formula(
	:Label || " (" || Char(:Data) || ")"
));

Summarize(dt_stacked, rs = By(:R));
fixed_rs = Transform Each({r}, rs, Num(r));
vals = Associative Array();

For Each({r}, fixed_rs,
	t = dt_stacked << Get Rows Where(:R == r);
	vals[r] = Concat Items(dt_stacked[t, "NewStr"], ",");
);
Close(dt_ stacked);

new_col = dt << New Column("Res", Character, Nominal, Formula(
	If(Contains(vals, :R),
		vals[:R]
	,
		""
	);
	
));
dt << run formulas;
new_col << Delete Formula;

 

-Jarmo

View solution in original post

25 REPLIES 25
txnelson
Super User

Re: getting column name with specific cut-off values

Here is how I would handle this

txnelson_0-1718733620444.png

names default to here(1);
dt = New Table( "Example",
	Add Rows( 6 ),
	New Column( "A*1:01",
		Numeric,
		"Continuous",
		Format( "Best", 12 ),
		Set Values(
			[10.67, 0, 9040.2999999999993, 34.399999999999999, 9190.2000000000007, 0
			]
		)
	),
	New Column( "A*2:01",
		Numeric,
		"Continuous",
		Format( "Best", 12 ),
		Set Values( [0, 0, 345.30000000000001, 4566, 13567, 0] )
	),
	New Column( "B*9:02",
		Numeric,
		"Continuous",
		Format( "Best", 12 ),
		Set Values( [9520.2299999999996, 0, 345.89999999999998, 45, 3456, 0] )
	),
	New Column( "C*10:01",
		Numeric,
		"Continuous",
		Format( "Best", 12 ),
		Set Values( [15000.1, 0, 999.89999999999998, 7, 6789, 0] )
	),
	New Column( "DR*56:01",
		Numeric,
		"Continuous",
		Format( "Best", 12 ),
		Set Values( [90.560000000000002, 0, 345, 0, 9088, 0] )
	),
	New Column( "DP*07_G",
		Numeric,
		"Continuous",
		Format( "Best", 12 ),
		Set Values( [40.890000000000001, 9.5600000000000005, 10.5, 0, 0, 0] )
	)
);

dt << New Column( "Get Name",character);

colNames = dt << get column names( continuous,string );

For Each Row(
	theList = {};
	For Each( {col, i}, colNames,
		If( As Column( col ) > 3000,
			Insert Into( theList, col )
		)
	);
	:Get Name = Concat Items( theList, "," );
);

Jim
doraemengs
Level III

Re: getting column name with specific cut-off values

Hi Jim,

 

Thank you for your help. However, I have 180 columns and 20000 rows in difference words and I cannot put all of them as following scripts. Could you please provide me with more suggestions?

 

Thank you

jthi
Super User

Re: getting column name with specific cut-off values

This might be fast enough

Names Default To Here(1);

// Create a sample data table
dt = New Table("Example",
	Add Rows(20000),
	invisible
);

colcount = 200;
For(i = 1, i <= colcount, i++,
	newcol = dt << New Column(Char(i), Numeric, Continuous, Formula(
		Random Normal() * 1000
	));
	dt << run formulas;
	newcol << delete formula;
);

start = Tick Seconds();
colnames = dt << Get Column Names("String");
newcol = dt << New Column("getname", Character, Nominal, Formula(
	valid_idx = Loc(dt[Row(), colnames] > 1000);
	If(N Items(valid_idx) > 0,
		Concat Items(colnames[valid_idx], ", ")
	,
		""
	);
));
dt << run formulas;
newcol << delete formula;

wait(0);
end = Tick Seconds();
show(end - start);
-Jarmo
doraemengs
Level III

Re: getting column name with specific cut-off values

Hi Jarmo,

 

Thank you so much. But the result is "end - start = 0.0333333333255723;" That's not the new table. could you please provide more solutions? please show me the following data table on JMP. 

 

Thank you

jthi
Super User

Re: getting column name with specific cut-off values

My script creates new column "getname" with the values

jthi_0-1718737059176.png

 

With your data, after you fix your columns to numeric (and drop No column)

jthi_1-1718737144803.png

 

-Jarmo
doraemengs
Level III

Re: getting column name with specific cut-off values

Hi Jarmo,

 

Thank you for your help but to be honest as in the following attachment, I cannot run the script as you did. when I checked the script (command + 1) shown below. Thank you for your help.

dt = New Table("Example",
	Add Rows(20000),
	invisible
);

colcount = 200;
For(i = 1, i <= colcount, i++,
	newcol = dt << New Column(Char(i), Numeric, Continuous, Formula(
		Random Normal() * 1000
	));
	dt << run formulas;
	newcol << delete formula;
);

start = Tick Seconds();
colnames = dt << Get Column Names("String");
newcol = dt << New Column("getname", Character, Nominal, Formula(
	valid_idx = Loc(dt[Row(), colnames] > 1000);
	If(N Items(valid_idx) > 0,
		Concat Items(colnames[valid_idx], ", ")
	,
		""
	);
));
dt << run formulas;
newcol << delete formula;

wait(0);
end = Tick Seconds();
show(end - start);
/*:

end - start = 0.683333333348855;

 

jthi
Super User

Re: getting column name with specific cut-off values

You will have to open your table, fix the columns and then run the part where new column is created

Names Default To Here(1);

dt = Open("$DOWNLOADS/Examples_get column name_by cutoff.jmp", invisible);

colnames = Remove(dt << Get Column Names("String"), 1); // drop No column

// Fix data and modeling types
For Each({colname}, colnames,
	Column(dt, colname) << Data Type("Numeric") << Modeling Type("Continuous");
);


newcol = dt << New Column("getname", Character, Nominal, Formula(
	valid_idx = Loc(dt[Row(), colnames] > 1000);
	If(N Items(valid_idx) > 0,
		Concat Items(colnames[valid_idx], ", ")
	,
		""
	);
));
dt << run formulas;
newcol << delete formula;

I strongly suggest you check out Scripting Guide from JMP Help

-Jarmo
doraemengs
Level III

Re: getting column name with specific cut-off values

Hi Jarmo, Do you have any idea for extracting the values of this that are more than the cutoff (if I have 50 columns) with each column name (getname column on an examples)

jthi
Super User

Re: getting column name with specific cut-off values

How do you wish to extract them? Doesn't my solution create new column which has the column names as strings separated by , and space?

jthi_0-1723564823980.png

Do you wish to check values by column instead of by row? Or something else?

 

Edit:

If you just want different column list, I would have to know how to filter it. If you just want all columns, remove the Remove() function

Names Default To Here(1);

dt = Open("$DOWNLOADS/Examples_get column name_by cutoff.jmp", invisible);

colnames = dt << Get Column Names("String");
For Each({colname}, colnames,
	Column(dt, colname) << Data Type("Numeric") << Modeling Type("Continuous");
);


For(i = 1, i < 100, i++,
	newcol = dt << New Column(Hex(i), Numeric, Nominal, Values(J(1, N Rows(dt), Random Normal()*1000)));
);

colnames = dt << Get Column Names("String");

newcol = dt << New Column("getname", Character, Nominal, Formula(
	valid_idx = Loc(dt[Row(), colnames] > 1000);
	If(N Items(valid_idx) > 0,
		Concat Items(colnames[valid_idx], ", ")
	,
		""
	);
));
dt << run formulas;
newcol << delete formula;
-Jarmo