BookmarkSubscribe
Choose Language Hide Translation Bar
Chris_Rodrigues
Community Trekker

Why does selecting/deselecting rows trigger column formulas to be re-evaluated?

I have a data table with approximately 100,000 rows and 2000 columns.  Dozens of these columns contain formulas.  When I make a change somewhere I notice the status bar says "evaluating ##column_xyz" and it proceeds to evaluate every formula in the data table.  Since this is a rather large table and there are many formulas, it takes up to 60 seconds to evaluate all of them.  Consequently, I am very sensitive to performing actions which trigger formula evaluation, because the user interface will become unresponsive until it finishes and this can be quite annoying when it happens unexpectedly.  On the other hand, I do not want to "Suppress Formula Eval" all the time because sometimes I will forget about it and data won't be updated when it should have been.

 

So on this large data table I have a JSL script that generates a set of charts using the Graph Builder platform, applies some local data filters, performs some selective highlighting of the data.  Essentially the purpose of the script is to dynamically generate charts and prepare them for cut-and-paste into an email or report.

 

I noticed that this script seems to take a long time to run.  Longer than it should.  The UI goes unresponsive for almost 60 seconds before the graphs finally come up.  I spent some time trying to figure out what is happening during those 60 seconds.  Then it dawned on me that somehow formula evaluation is being triggered by this script.  This was not immediately obvious because the data table status bar doesn't show the normal "evaluating..." message; it remains blank, presumably because the evaluation is being triggered by a script instead of by user action.  When I ran the script while the Script Log window was open, the log status bar said "evaluating ##column_with_formula".  The log status bar was frozen on that message even though the duration of the delay leads me to believe it was, in fact, evaluating every formula column in the table.  And then I confirmed that formula evaluation was the culprit by enabling "Suppress Formula Eval" for the entire table and running the script again - the graphs came up almost instantly.

 

Through trial and error I narrowed it down to one line in the script - a "Select Where" is used to highlight some of the data on the generated graphs.  It looks something like this:

 

myDT << Select Where( :Name("##column_with_formula") == "value" );

I thought this was strange; maybe it had something to do with the "where" part.  So I replaced this line with something simpler:

 

myDT << Select Rows( {1,3,5,7} );

Even this line still seems to trigger formula evaluation behind the scenes.  Then I did some more experimentation:

  • With task manager open, I triggered formula evaluations the normal way (change a value in a column).  The status bar showed formulas being evaluated and instantly my CPU usage shot up to 30% (one CPU core maximized).
  • I waited for the activity to stop, and I tried selecting a row on the data table using the UI.  The status bar did not change; it still showed "evaluations done".  But task manager showed exactly the same CPU activity as if formula evaluation were happening.
  • I enabled "Suppress Formula Eval" for the data table and selected a different row on the data table.  The CPU usage did not spike up.

 

To summarize, through experimentation I have demonstrated the following:

  1. JMP performs column formula evaluation superfluously; basic interactions with the user interface cause calculation to occur in the background even though none of the input data has actually been modified.
  2. JMP is not transparent as to when it is performing these formula evaluations.  The status bar sometimes displays a message but most of the time gives no indication that this is occurring.  This often manifests as user interface lag / stuttering / unresponsiveness for no apparent reason.
  3. It seems like if one column is triggered for evaluation, ALL columns are evaluated, regardless of whether or not they needed it.

 

With smaller data sets and fewer, less-complex formulas, the excessive formula evaluation doesn't seem to be an issue.  But with enough complex formulas and a large data set, this really becomes a hindrance.  In my case, I guess I will have to adapt to leaving formula eval suppressed all the time and only evaluating formulas manually.  But this is prone to mistakes, just like in Excel when workbook recalculation is set to manual.  At least Excel has a dependency tree that allows it to evaluate only the necessary formulas for the data that has changed, and if the user designs the formulas appropriately the calculation can be left in Automatic all the time with minimal overhead.  Admittedly I am not as familiar with JMP as I am with Excel, so there may be ways to deal with this in JMP that I have not learned yet. 

 

Does anyone have any suggestions on how to manage large numbers of complex formulas WITHOUT suppressing eval all the time?

 

Also, as stated in the title of my post, I would be interested to know WHY a change in row selection trigger formula eval.  And also, why does does JMP sometimes tell you it's evaluating formulas, and sometimes hides it?

 

0 Kudos
9 REPLIES 9
txnelson
Super User

Re: Why does selecting/deselecting rows trigger column formulas to be re-evaluated?

The recaclulation is done, because a formula could be effected by a change in the selection/exclusion of a cell for a given column.  Given the size of your data table, I would look into changing any columns where the formula values will not be changing, and then remove the "Formula" property, turning them into just a typical numeric column.

Jim
Chris_Rodrigues
Community Trekker

Re: Why does selecting/deselecting rows trigger column formulas to be re-evaluated?

Jim, thanks for the reply.  This table contains time-series data and I regularly import the most current data.  I would want to preserve the formulas so that they are calculated on the new data after it has been added to the table.

 

For the majority of these formulas, the value would only depend on the other data in that same row.  Excluding data earlier in the table would have no effect on the values below so recalculation of the entire column is unnecessary but it happens anyway.  It would be nice if this only happened if the formula contained a Col Maximum() or Lag() or other such function that would cause the formula to be affected by values in other rows.

 

Also, the most surprising part for me was that the mere selection of a row triggered a full table re-evaluation (as evidenced by CPU usage and UI responsiveness).  If a formula could be affected by excluding a row, why not trigger evaluation when the user actually excludes a row?  I guess it's theoretically possible that a user could write a column formula that depends on the current row selection state, but that seems like a rare edge case to me.

0 Kudos
Craige_Hales
Staff (Retired)

Re: Why does selecting/deselecting rows trigger column formulas to be re-evaluated?

Yes, if any formula is dependent on the row-state information, changing the row selection will reevaluate that column and any columns that are dependent on it.

You may be able to speed the evaluation up, a lot, by using dt<<runFormulas after the change in the row states. 

Craige
Chris_Rodrigues
Community Trekker

Re: Why does selecting/deselecting rows trigger column formulas to be re-evaluated?

Craige, my point was that all of the formulas recalculate whether or not they are dependent on row state.    The vast majority of formulas do not have anything to do with row state, but a change in row state causes them to recalculate anyway.  I guess JMP calculates everything, all the time, if anything happens that may affect any type of formula that could possibly exist.  In my mind that's unnecessary and inefficient.  Like using a battle axe when a scalpel would suffice.  I'm sure it's easier to code the program that way; it just seems like an unsophisticated approach and it surprised me when I learned this is how JMP handles formula eval.

 

Now that I understand better how formulas are calculated, I can certainly work around it by keeping formula eval suppressed and inserting dt << runFormulas; and Wait(0); statement throughout my scripts.  It won't be pretty, but it'll work I suppose.

0 Kudos
Chris_Rodrigues
Community Trekker

Re: Why does selecting/deselecting rows trigger column formulas to be re-evaluated?

In the Preferences menu under "Tables" there is an option for "Suppress Formula Eval on Open (Not recommended)".  I turned this option on but it doesn't seem to work.  When I open up a data table it starts out with formula eval unsuppressed and I still have to suppress manually.  Is this option broken or am I misunderstanding what it's supposed to do?

0 Kudos
Craige_Hales
Staff (Retired)

Re: Why does selecting/deselecting rows trigger column formulas to be re-evaluated?

What version of JMP are you running? 

As far as I can tell, 14 is doing what I expect; here's my test JSL

New Table( "Untitled", add rows(3),
	// EVERYTHINGCOL depends on everything. it evaluates last.
	New Column( "EVERYTHINGCOL", Formula( Write( "\!nEVERYTHINGCOL,row" || Char( Row() ) ); :ROWSTATECOL + :INDEPCOL + :DEPONINDEP; ) ),
	// DEPONINDEP depends on INDEPCOL, but not ROWSTATECOL.
	New Column( "DEPONINDEP", Formula( Write( "\!nDEPONINDEP,row" || Char( Row() ) ); :INDEPCOL + 1; ) ),
	// INDEPCOL is independent. It evaluates early, maybe first.
	New Column( "INDEPCOL", Formula( Write( "\!nINDEPCOL,row" || Char( Row() ) ); 17; ) ),
	// ROWSTATECOL is directly dependent on row state information. It evaluates early, maybe first. 
	New Column( "ROWSTATECOL", Formula( Write( "\!nROWSTATECOL,row" || Char( Row() ) ); Selected( Row State() ); ) ),
)

Watch the log; the columns report as they evaluate. Selecting rows causes two of the columns to evaluate, but the other two do not.

Craige
Chris_Rodrigues
Community Trekker

Re: Why does selecting/deselecting rows trigger column formulas to be re-evaluated?

Craige, that's a good idea to write to the log when a formula is evaluating.  I will use this technique to further troubleshoot the UI lag and delays in my project that are related to formula evaluation.  My method was not as precise.  But I do have a question - is there a 'debug' mode or 'verbose logging' feature in JMP that I could turn on?  It would be nice if I didn't have to put so many Write statements in my code just to figure out what JMP is doing at any given moment.

 

However, for now, I still need to use this project as-is for my current work.  I just want to suppress formula evaluation completely and re-run the formulas manually as needed.  Even though I enabled "Suppress Eval on Open" in the preferences menu, JMP still evaluates the formulas immediately when I open the table.  Is there a way to fully prevent all automatic formula evaluation on this table?  If this preference is supposed to do this but it isn't working, is there a way I can submit a bug report?

 

FYI, I am using JMP PRO 14.1 on Windows 10 x64.

0 Kudos
Craige_Hales
Staff (Retired)

Re: Why does selecting/deselecting rows trigger column formulas to be re-evaluated?

I have not been able to make a table that has formulas evaluating when it opens. @chungwei 

 

The preference might have an effect on JMP tables created in JMP 5 or so.  I can't see that it does anything currently.

 

It is possible to use the data table's red triangle options to suppress formula evaluation and save a table in that state (with partially evaluated columns). You can use rerunFormulas from the same red triangle to force the update (or from JSL). (This might be what you are asking for.)

 

The reason the pref says not recommended is it becomes very confusing when tables get the suppress bit set...or it did at one time.

 

If you go though tech support, they will enter and track a defect (or figure out why the preference exists and what it did/does for older format tables.)

 

There is no debugging/logging framework.

Craige
Chris_Rodrigues
Community Trekker

Re: Why does selecting/deselecting rows trigger column formulas to be re-evaluated?

This table was not created in an old version of JMP.  It is maybe a year old so it was either JMP 13 or 14.

 

I used the red triangle menu to check the suppress eval box for the table, then I saved it and closed it.  Upon reopening the table, the suppress eval box is no longer checked, so the table-wide setting apparently wasn't preserved by saving the table.  When the table opens the status bar immediately displays "evaluating xyz...".  I'm not sure what is triggering this to happen.

 

I had an OnOpen script that was being used to populate data into some associative arrays that are used in some data lookup functions.  I renamed OnOpen to something else to disable it so it doesn't fire when I open the table, but the evaluations still happen.

 

I wrote a JSL script to go into the column properties for most of the formula columns and suppress eval there, because I know this option is preserved when the table is saved. So now the full-table eval doesn't take nearly as long.  I guess this is how I will handle it from now on: suppress eval on individual columns, and when I import new data that needs the formulas to run, un-suppress them temporarily using a script.

0 Kudos