cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
EV1976
Level I

Select columns with a "pattern"

Hello
This is a simple task but I didn't find an immediate way to do it.

I have a table with many columns (~5000); I need to select a subset of columns and then to extract a subset table.

The dumbest way to do it would be to pick each column from the column selector.

Actually, I could easily identify the column I need by applying to the column name a "pattern"

For example, let the column be the following:

AAA11 

AAA12
ABB22

XXX99

CAB32

CAB99

I need to select the columns whose name includes a "B" and terminates with a "2"

They could be identified by a pattern such as "*B*2" , where "*" is a wildcard.

In the example, the pattern would select "ABB22" and "CAB32"

I did not find  an easy way to do this.

thanks in advance

 

Emanuele

 

 

 

2 REPLIES 2
txnelson
Super User

Re: Select columns with a "pattern"

Here is a simple script to come up with a list of columns that meets the criteria

Names Default To Here( 1 );
dt = Current Data Table();
colList = dt << get column names( string );
For( i = N Items( colList ), i >= 1, i--,
	If( !(Contains( colList[i], "B" ) & Right( colList[i], 1 ) == "2"),
		Remove From( colList, i, 1 )
	)
);
Jim
Craige_Hales
Super User

Re: Select columns with a "pattern"

If you really need regex and have a recent version of JMP that includes FilterEach:

dt = Open( "$sample_data/big class.jmp" );
colList = dt << get column names( string );
// (>?.*?) is an atomic match for a reluctant (not greedy) match of 0 or more characters
// atomic is important when the match might fail to keep if from slowing down a lot on long names
Filter Each( {name}, colList, !Is Missing( Regex( name, "^((.e.)|((>?.*?)a(>?.*?)e.*))$" ) ) );

{"name", "age", "sex"}

Regex, as always, is more fun to write than read. Be careful with multiple .* in a single regex that might not match...it might try a lot of fruitless alternatives.

 

Craige