cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
JMP is taking Discovery online, April 16 and 18. Register today and join us for interactive sessions featuring popular presentation topics, networking, and discussions with the experts.
Choose Language Hide Translation Bar

Is there a way to extract columns used in a formula?

Hello folks-

I'm trying to create a new button in my platform which would extract selected columns from a step-wise analysis and then build up univariate distributions of each selection. The limitation is that the extraction of columns selected and building the distribution is held to the specific column names in the list. I would like to be able to dive into the column to check if it is a formula, extract any columns used if it is, and then turn around and include the extra columns into the list of selected columns from the step-wise.

More specifically, can I extract columns from a formula? Using the ":" piece doesn't seem like it will capture complex names and the index only refers to just grabbing the whole formula.

An example would be having my "Volume" column in a list:

{:Volume}

Doing column("Volume")<<Get Formula; returns:

:Height * :Width * :Length

I want to end with a list of:
{:Height, :Width, :Length, :Volume}

This is a simple formula example, there would be other versions/types with simple or complex names...

Thanks,

Andrew Parker

1 ACCEPTED SOLUTION

Accepted Solutions
Craige_Hales
Super User

Re: Is there a way to extract columns used in a formula?

Ian - recursive version here

Craige

View solution in original post

5 REPLIES 5
ian_jmp
Staff

Re: Is there a way to extract columns used in a formula?

The best way to do this would be to recursively tokenise the formula expression into a tree, then scan the leaf nodes for column references. This is perfectly possible in JSL using the expression manipulation primitives provided (look in the Section 'Lists and Expressions' within the JMP Scripting Guide, paying attention to the 'Head()', 'NArg()' and 'Arg()' functions). But you might need to dig out a computer science book to code up this method.

Alternatively, the code below offers a kludgy way to get JMP to do the work for you (using the profiler):

NamesDefaultToHere(1);

dt = New Table( "Get Columns in Formula",

  Add Rows( 20 ),

  New Column( "H",

  Numeric,

  "Continuous",

  Format( "Best", 12 ),

  Formula( Random Uniform() )

  ),

  New Column( "W",

  Numeric,

  "Continuous",

  Format( "Best", 12 ),

  Formula( Random Uniform() )

  ),

  New Column( "D",

  Numeric,

  "Continuous",

  Format( "Best", 12 ),

  Formula( Random Uniform() )

  ),

  New Column( "V",

  Numeric,

  "Continuous",

  Format( "Best", 12 ),

  Formula( (:W * :H * :D) / 10 )

  )

  );

// Launch the Profiler invisibly

p = dt << Profiler(Y( :V ), Expand, Invisible);

// Get the grid table

p << OutputGridTable;

// Get the columns in the grid table into a list

cols = DataTable(NTable()) << getColumnNames("String");

// Clean up

p << CloseWindow;

Close(DataTable(NTable()), NoSave);

Craige_Hales
Super User

Re: Is there a way to extract columns used in a formula?

Ian - recursive version here

Craige
ian_jmp
Staff

Re: Is there a way to extract columns used in a formula?

Nice code! Many thanks.

Re: Is there a way to extract columns used in a formula?

Thanks! This definitely appears to be what I need to complete my program.

To check my understanding of the formula, if I were visualize what's happening using the formula builder:

8477_pastedImage_0.png

It looks at the main box and asks 'what's the formula?' and 'how many pieces are required?' to which it looks at each piece and repeats the question until it hits a box with only leaves inside and returns the string for whichever leaves are a column name . Once the smallest leaf box is achieved, it starts to back out and look at the nodes/leaves it hadn't checked on its path to the first leaf box...

Not super eloquent in description, but does that sound right?

ian_jmp
Staff

Re: Is there a way to extract columns used in a formula?

I think that's a reasonable paraphrase of Craige@JMP entry above, Andrew.

Note, though, that your example immediately above shows that you will need to keep only the distinct items in the list (so 'W' occurs only once). You could put this logic in Craige's function, or just manipulate the list that his function returns.