cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
poulravn
Level IV

More on subset creation

Hi, I am referring to the post https://communities.sas.com/message/166376#166376 from May this year.

The example explains how to create a new data table by 1) first adding af data filter, and 2) then subsetting:

dt = Open ( BigClass);

dt << data filter (add filter(columns(:age),where(:age=={12,13}), Mode (select(1)));

sub1 = dt<<subset(output table name("regions"),selected rows(1), selected columns (0));

As  result a data table, named "regions" is created. As expected.

I have created a similar example using my own data set (the BigClass example works!!), which does not work well for me. I suspect I have a misunderstand of something. Here is my program flow attached.

The difference between the BigClass example and my example is, that the data filter does not get selected, when I run the whole script. (I omitted the loading of the file from an Excel file to be able to send you the original code, which had certain names in it). To explain: There is a character column, from which I wnat to select only those rows where the text in the column starts with "d" or "I". I found it necessary to create a new column on which to perform the data filter. When I run the code in total, the rows do NOT get selected and the subsetted data table does not get created. If I first run the code UNTIL the creation of the new column, Onelettet, and THEN run the data filter, then the filter is created and the rows are selected. And finally the subset data table is created. But these things should happen either way how the script is run? OR?

This is fundamental to the use of scripting, so I need to get a deep understanding of these subtleties.

Regards

Poul

1 ACCEPTED SOLUTION

Accepted Solutions
ms
Super User (Alumni) ms
Super User (Alumni)

Re: More on subset creation

It is difficult to tell why it not works in a particular case without a table for testing the code. If code depends on column formulas created in the same script, I have experienced that formula evaluation can lag behind the script execution which can lead to unexpected results. To avoid this one can try to insert Wait() or << eval formula at the critical steps.

However in this particular case, if I understand it correctly, there is no need for a new column (or a data filter). Perhaps the below example can be of help.

//example table

dt = New Table( "test",

  Add Rows( 5 ),

  New Column( "Instr", Character,

  Set Values( {"drum", "trumpet", "lyre", "clarinet", "didgeridoo"} )

  )

);

// Select instr starting with d or l using a data filter

sel = dt << get rows where( Starts With( :instr, "d" ) | Starts With( :instr, "l" ) );

dt << data filter(

  add filter( columns( :instr ), where( :instr = :instr[sel] ), mode( select( 1 ) ) )

);

dt << subset( output table name( "sel instr" ), Selected Rows( 1 ), selected columns( 0 ) );

// Or a more direct approach without data filter

dt << select where( Starts With( :instr, "d" ) | Starts With( :instr, "l" ) );

dt << subset( output table name( "sel instr" ), Selected Rows( 1 ), selected columns( 0 ) );

View solution in original post

4 REPLIES 4
ms
Super User (Alumni) ms
Super User (Alumni)

Re: More on subset creation

It is difficult to tell why it not works in a particular case without a table for testing the code. If code depends on column formulas created in the same script, I have experienced that formula evaluation can lag behind the script execution which can lead to unexpected results. To avoid this one can try to insert Wait() or << eval formula at the critical steps.

However in this particular case, if I understand it correctly, there is no need for a new column (or a data filter). Perhaps the below example can be of help.

//example table

dt = New Table( "test",

  Add Rows( 5 ),

  New Column( "Instr", Character,

  Set Values( {"drum", "trumpet", "lyre", "clarinet", "didgeridoo"} )

  )

);

// Select instr starting with d or l using a data filter

sel = dt << get rows where( Starts With( :instr, "d" ) | Starts With( :instr, "l" ) );

dt << data filter(

  add filter( columns( :instr ), where( :instr = :instr[sel] ), mode( select( 1 ) ) )

);

dt << subset( output table name( "sel instr" ), Selected Rows( 1 ), selected columns( 0 ) );

// Or a more direct approach without data filter

dt << select where( Starts With( :instr, "d" ) | Starts With( :instr, "l" ) );

dt << subset( output table name( "sel instr" ), Selected Rows( 1 ), selected columns( 0 ) );

poulravn
Level IV

Re: More on subset creation

Hi MS,

adding wait(); after the formula evaluation solves the problem of the filter not getting set before the subset command is executed. Thank you.

What is the command for closing the filter window? I do not want it to stay on screen after the subset. I just want the data set. Actually I do not need to see it at all.

And yes, I thought the selection could be done in one statement. Just did not know how to do it.

Thank you

Poul

ms
Super User (Alumni) ms
Super User (Alumni)

Re: More on subset creation

Closing the data filter is simple if you have given it a reference.

df = dt << data filter( ... );

df << close;

wiebepo
Level III

Re: More on subset creation

Using MS's example code, here is a direct solution that is very robust, since it does not require the rows in the data table to switch to the selected state:

dt = New Table( "test",

  Add Rows( 5 ),

  New Column( "Instr", Character,

  Set Values( {"drum", "trumpet", "lyre", "clarinet", "didgeridoo"} )

  )

);

sel = dt << get rows where( Starts With( :instr, "d" ) | Starts With( :instr, "l" ) );

dt << subset( output table name( "sel instr" ),  Rows( sel ), selected columns( 0 ) );