Subscribe Bookmark RSS Feed

Is there a way to extract columns used in a formula?

andrew_j_parker

Community Trekker

Joined:

Aug 26, 2014

Hello folks-

I'm trying to create a new button in my platform which would extract selected columns from a step-wise analysis and then build up univariate distributions of each selection. The limitation is that the extraction of columns selected and building the distribution is held to the specific column names in the list. I would like to be able to dive into the column to check if it is a formula, extract any columns used if it is, and then turn around and include the extra columns into the list of selected columns from the step-wise.

More specifically, can I extract columns from a formula? Using the ":" piece doesn't seem like it will capture complex names and the index only refers to just grabbing the whole formula.

An example would be having my "Volume" column in a list:

{:Volume}

Doing column("Volume")<<Get Formula; returns:

:Height * :Width * :Length

I want to end with a list of:
{:Height, :Width, :Length, :Volume}

This is a simple formula example, there would be other versions/types with simple or complex names...

Thanks,

Andrew Parker

1 ACCEPTED SOLUTION

Accepted Solutions
Craige_Hales

Staff

Joined:

Mar 21, 2013

Solution

Ian - recursive version here

Craige
5 REPLIES
ian_jmp

Staff

Joined:

Jun 23, 2011

The best way to do this would be to recursively tokenise the formula expression into a tree, then scan the leaf nodes for column references. This is perfectly possible in JSL using the expression manipulation primitives provided (look in the Section 'Lists and Expressions' within the JMP Scripting Guide, paying attention to the 'Head()', 'NArg()' and 'Arg()' functions). But you might need to dig out a computer science book to code up this method.

Alternatively, the code below offers a kludgy way to get JMP to do the work for you (using the profiler):

NamesDefaultToHere(1);

dt = New Table( "Get Columns in Formula",

  Add Rows( 20 ),

  New Column( "H",

  Numeric,

  "Continuous",

  Format( "Best", 12 ),

  Formula( Random Uniform() )

  ),

  New Column( "W",

  Numeric,

  "Continuous",

  Format( "Best", 12 ),

  Formula( Random Uniform() )

  ),

  New Column( "D",

  Numeric,

  "Continuous",

  Format( "Best", 12 ),

  Formula( Random Uniform() )

  ),

  New Column( "V",

  Numeric,

  "Continuous",

  Format( "Best", 12 ),

  Formula( (:W * :H * :D) / 10 )

  )

  );

// Launch the Profiler invisibly

p = dt << Profiler(Y( :V ), Expand, Invisible);

// Get the grid table

p << OutputGridTable;

// Get the columns in the grid table into a list

cols = DataTable(NTable()) << getColumnNames("String");

// Clean up

p << CloseWindow;

Close(DataTable(NTable()), NoSave);

Craige_Hales

Staff

Joined:

Mar 21, 2013

Solution

Ian - recursive version here

Craige
ian_jmp

Staff

Joined:

Jun 23, 2011

Nice code! Many thanks.

andrew_j_parker

Community Trekker

Joined:

Aug 26, 2014

Thanks! This definitely appears to be what I need to complete my program.

To check my understanding of the formula, if I were visualize what's happening using the formula builder:

8477_pastedImage_0.png

It looks at the main box and asks 'what's the formula?' and 'how many pieces are required?' to which it looks at each piece and repeats the question until it hits a box with only leaves inside and returns the string for whichever leaves are a column name . Once the smallest leaf box is achieved, it starts to back out and look at the nodes/leaves it hadn't checked on its path to the first leaf box...

Not super eloquent in description, but does that sound right?

ian_jmp

Staff

Joined:

Jun 23, 2011

I think that's a reasonable paraphrase of Craige@JMP entry above, Andrew.

Note, though, that your example immediately above shows that you will need to keep only the distinct items in the list (so 'W' occurs only once). You could put this logic in Craige's function, or just manipulate the list that his function returns.