Subscribe Bookmark RSS Feed

Stack columns based on regex selection

tsandidge

Community Trekker

Joined:

Mar 17, 2016

I am constantly stacking hundreds of columns which have common prefixes and end with a different number or suffix.

What I would like to do is write a script to:

  • Find columns based on a regex pattern, e.g. "BLUE_CARS_*_\d{1,2}", "HOMES_(SINGLE|DOUBLE)_WIDE", etc.
  • Select those columns
  • Stack the columns (and place in a new datasheet)

For the life of me, I can't find (or Google search properly for) how to use regex with column selection.

I considered selecting the columns by data type, but there are other columns present which share that data type, not to

mention I want something much more flexible.

Any suggestions or code snippets would be immensely appreciated, and thank you in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
Solution

Something like this.

dt = Open( "$sample_data/big class.jmp" );

allCols = dt << getcolumnnames;

Show( allCols ); // these are names, not strings

acols = {}; // columns with "a" in their name

icols = {}; // columns with "i" in their name

n = N Items( allCols );

For( k = 1, k <= n, k++,

  cname = allCols << getname; // these are strings, good for regex

  Show( cname );

  If( !Is Missing( Regex( cname, "a" ) ),

  Insert Into( acols, allCols ) // keep the names

  );

  If( !Is Missing( Regex( cname, "i" ) ),

  Insert Into( icols, allCols ) // keep the names

  );

);

Show( acols, icols ); // {:name, :age} and {:height, :weight}

dt << Stack(

  columns( icols ),

  Source Label Column( "Label" ),

  Stacked Data Column( "Data" )

);

11154_stack.PNG

Craige
8 REPLIES
Solution

Something like this.

dt = Open( "$sample_data/big class.jmp" );

allCols = dt << getcolumnnames;

Show( allCols ); // these are names, not strings

acols = {}; // columns with "a" in their name

icols = {}; // columns with "i" in their name

n = N Items( allCols );

For( k = 1, k <= n, k++,

  cname = allCols << getname; // these are strings, good for regex

  Show( cname );

  If( !Is Missing( Regex( cname, "a" ) ),

  Insert Into( acols, allCols ) // keep the names

  );

  If( !Is Missing( Regex( cname, "i" ) ),

  Insert Into( icols, allCols ) // keep the names

  );

);

Show( acols, icols ); // {:name, :age} and {:height, :weight}

dt << Stack(

  columns( icols ),

  Source Label Column( "Label" ),

  Stacked Data Column( "Data" )

);

11154_stack.PNG

Craige
Craige_Hales

Staff

Joined:

Mar 21, 2013

You might have to turn off the translate checkbox at the top of the screen to see the JSL correctly.

Craige
tsandidge

Community Trekker

Joined:

Mar 17, 2016

Great! That's exactly what I needed.

I'm still getting familiar with the jsl syntax and the need to 'show' everything.

And for the Regex() function, I couldn't find any documentation on using that outside of a regular column function, so that's handy it works the same way. This will prove useful in so many ways.

Related, how would I make sure I have the new table/window selected to use in the rest of a script?

Also, I definitely had to turn off the translation, because it was messing up the code.

Thank you!

Craige_Hales

Staff

Joined:

Mar 21, 2013

welcome, thanks.  Use an explicit data table whenever possible:

dtstacked = dt << Stack(

  columns( icols ),

  Source Label Column( "Label" ),

  Stacked Data Column( "Data" )

);

dtstacked will reference the data table that results from dt<<stack.  Use the dtstacked variable to make sure commands go to that table.

Craige
Craige_Hales

Staff

Joined:

Mar 21, 2013

And yes, I left some show functions in the JSL because they are helpful debugging aids.  Take them out when you are done so the log isn't cluttered with old answers to old questions.

Craige

Craige
tsandidge

Community Trekker

Joined:

Mar 17, 2016

I'm encountering issues turning this code into a JMP add-in (with a button). The code works great when I run it from the script editor, but when I try to run it as an add-in, it gives the following complaint:

    Send Expects Scriptable Object in access or evaluation of 'Send' , nc <<  /*###*/get name/*###*/

For( i = 1, i <= N Items( nc ), i++,  cname = nc <<  /*###*/get name/*###*/;  model = nc << get modeling type;

If I change the first line to read, cname = dt << get name(nc); it moves on to complain about the next line.
I then change this line to read, model = dt << get modeling type(nc); and it moves on down to a sort() function, changing the error:
Send Expects Scriptable Object in access or evaluation of 'Send' , transposed <</*###*/sort( replace table, by( "Row 1" ), order( Descending ) ) /*###*/
and the code reads

transposed = transposed << /*###*/ sort(replace table, by("Row 1"), order(Descending))/*###*/;


Any ideas about how I can resolve this as well as make it portable (work in both the editor and as an add-in without a problem)? My apologies for how unclear this may seem.

(Wow, why is there no message preview button...).

Craige_Hales

Staff

Joined:

Mar 21, 2013

show( i, nc );

will show the current values of the two variables.  Slip that in after the "i++," increment in the loop.  nc *should* display as a list (in curly braces) of column names.  the <<getname message is intended for a column in this case.  The data table may also have a <<getname method, but that's probably not what you want, and the getname method probably ignores any argument you supply.

Check the log after you run it with the show command and see what's wrong.

Scriptable Objects in JMP are objects that understand messages.  A variable could hold a matrix, list, or number, which are not scriptable objects, or the variable could hold a displaybox, data table, or column, which are scriptable objects.  There are lots of other types too, some scriptable, some not.

My guess is the data table isn't explicitly chosen by the add-in, and the errors cascade from there.  You might need another show() before this one.

Craige

Craige
tsandidge

Community Trekker

Joined:

Mar 17, 2016

The show() and details about what was going on helped me see that we were actually telling it to use a column name, not the actual column–I suppose that makes sense since <code>nc = dt << get column names;</code> was used in the first place. So, instead of

<pre><code>cname = nc << get name;</code></pre>

we needed

<pre><code>cname = column(nc) << get name;</code></pre>

I guess by default <code>get column names()</code> returns the strings, not a list of the columns. Ah, well, it's working as expected now.

Thank you for the help!