- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Select columns with a "pattern"
Hello
This is a simple task but I didn't find an immediate way to do it.
I have a table with many columns (~5000); I need to select a subset of columns and then to extract a subset table.
The dumbest way to do it would be to pick each column from the column selector.
Actually, I could easily identify the column I need by applying to the column name a "pattern"
For example, let the column be the following:
AAA11
AAA12
ABB22
XXX99
CAB32
CAB99
I need to select the columns whose name includes a "B" and terminates with a "2"
They could be identified by a pattern such as "*B*2" , where "*" is a wildcard.
In the example, the pattern would select "ABB22" and "CAB32"
I did not find an easy way to do this.
thanks in advance
Emanuele
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Select columns with a "pattern"
Here is a simple script to come up with a list of columns that meets the criteria
Names Default To Here( 1 );
dt = Current Data Table();
colList = dt << get column names( string );
For( i = N Items( colList ), i >= 1, i--,
If( !(Contains( colList[i], "B" ) & Right( colList[i], 1 ) == "2"),
Remove From( colList, i, 1 )
)
);
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Get Direct Link
- Report Inappropriate Content
Re: Select columns with a "pattern"
If you really need regex and have a recent version of JMP that includes FilterEach:
dt = Open( "$sample_data/big class.jmp" );
colList = dt << get column names( string );
// (>?.*?) is an atomic match for a reluctant (not greedy) match of 0 or more characters
// atomic is important when the match might fail to keep if from slowing down a lot on long names
Filter Each( {name}, colList, !Is Missing( Regex( name, "^((.e.)|((>?.*?)a(>?.*?)e.*))$" ) ) );
{"name", "age", "sex"}
Regex, as always, is more fun to write than read. Be careful with multiple .* in a single regex that might not match...it might try a lot of fruitless alternatives.