cancel
Showing results for 
Show  only  | Search instead for 
Did you mean: 
Try the Materials Informatics Toolkit, which is designed to easily handle SMILES data. This and other helpful add-ins are available in the JMP® Marketplace
Choose Language Hide Translation Bar
Chris_Rodrigues
Level III

Why does selecting/deselecting rows trigger column formulas to be re-evaluated?

I have a data table with approximately 100,000 rows and 2000 columns.  Dozens of these columns contain formulas.  When I make a change somewhere I notice the status bar says "evaluating ##column_xyz" and it proceeds to evaluate every formula in the data table.  Since this is a rather large table and there are many formulas, it takes up to 60 seconds to evaluate all of them.  Consequently, I am very sensitive to performing actions which trigger formula evaluation, because the user interface will become unresponsive until it finishes and this can be quite annoying when it happens unexpectedly.  On the other hand, I do not want to "Suppress Formula Eval" all the time because sometimes I will forget about it and data won't be updated when it should have been.

 

So on this large data table I have a JSL script that generates a set of charts using the Graph Builder platform, applies some local data filters, performs some selective highlighting of the data.  Essentially the purpose of the script is to dynamically generate charts and prepare them for cut-and-paste into an email or report.

 

I noticed that this script seems to take a long time to run.  Longer than it should.  The UI goes unresponsive for almost 60 seconds before the graphs finally come up.  I spent some time trying to figure out what is happening during those 60 seconds.  Then it dawned on me that somehow formula evaluation is being triggered by this script.  This was not immediately obvious because the data table status bar doesn't show the normal "evaluating..." message; it remains blank, presumably because the evaluation is being triggered by a script instead of by user action.  When I ran the script while the Script Log window was open, the log status bar said "evaluating ##column_with_formula".  The log status bar was frozen on that message even though the duration of the delay leads me to believe it was, in fact, evaluating every formula column in the table.  And then I confirmed that formula evaluation was the culprit by enabling "Suppress Formula Eval" for the entire table and running the script again - the graphs came up almost instantly.

 

Through trial and error I narrowed it down to one line in the script - a "Select Where" is used to highlight some of the data on the generated graphs.  It looks something like this:

 

myDT << Select Where( :Name("##column_with_formula") == "value" );

I thought this was strange; maybe it had something to do with the "where" part.  So I replaced this line with something simpler:

 

myDT << Select Rows( {1,3,5,7} );

Even this line still seems to trigger formula evaluation behind the scenes.  Then I did some more experimentation:

  • With task manager open, I triggered formula evaluations the normal way (change a value in a column).  The status bar showed formulas being evaluated and instantly my CPU usage shot up to 30% (one CPU core maximized).
  • I waited for the activity to stop, and I tried selecting a row on the data table using the UI.  The status bar did not change; it still showed "evaluations done".  But task manager showed exactly the same CPU activity as if formula evaluation were happening.
  • I enabled "Suppress Formula Eval" for the data table and selected a different row on the data table.  The CPU usage did not spike up.

 

To summarize, through experimentation I have demonstrated the following:

  1. JMP performs column formula evaluation superfluously; basic interactions with the user interface cause calculation to occur in the background even though none of the input data has actually been modified.
  2. JMP is not transparent as to when it is performing these formula evaluations.  The status bar sometimes displays a message but most of the time gives no indication that this is occurring.  This often manifests as user interface lag / stuttering / unresponsiveness for no apparent reason.
  3. It seems like if one column is triggered for evaluation, ALL columns are evaluated, regardless of whether or not they needed it.

 

With smaller data sets and fewer, less-complex formulas, the excessive formula evaluation doesn't seem to be an issue.  But with enough complex formulas and a large data set, this really becomes a hindrance.  In my case, I guess I will have to adapt to leaving formula eval suppressed all the time and only evaluating formulas manually.  But this is prone to mistakes, just like in Excel when workbook recalculation is set to manual.  At least Excel has a dependency tree that allows it to evaluate only the necessary formulas for the data that has changed, and if the user designs the formulas appropriately the calculation can be left in Automatic all the time with minimal overhead.  Admittedly I am not as familiar with JMP as I am with Excel, so there may be ways to deal with this in JMP that I have not learned yet. 

 

Does anyone have any suggestions on how to manage large numbers of complex formulas WITHOUT suppressing eval all the time?

 

Also, as stated in the title of my post, I would be interested to know WHY a change in row selection trigger formula eval.  And also, why does does JMP sometimes tell you it's evaluating formulas, and sometimes hides it?

 

13 REPLIES 13
hogi
Level XII

Re: Why does selecting/deselecting rows trigger column formulas to be re-evaluated?

I had similar issues with a large data table.
Formula evaluations were triggered again and again.
After saving the table and re-opening, everything was OK.

So, perhaps the re-evaluation was triggered by  a variable name conflict . UAnd the issue was gone after resetting the variable name.

 

If other users have similar issues, here is a wish to quickly enable and disable individual formulas:
fast way to enable/disable formula calculations 

 

Another idea: disable formula evaluations by default - to have a smooth Jmp without such spurious formula-reevaluations.

If you add some data to the data table, just trigger a one-time formula evaluation. Here is the wish:
fast way to trigger one-time formula evaluation for a specific formula 

 

Chris_Rodrigues
Level III

Re: Why does selecting/deselecting rows trigger column formulas to be re-evaluated?

Well it's nice to finally hear I'm not the only one who struggled with this.  Almost four years later this still has not gotten any better.  There are all kinds of weird situations that trigger formula evaluation on the entire data table, even when formula eval is supposedly suppressed table-wide.  Suppressing formula eval on individual columns is unworkable because there is no way to handle dependencies.  All the columns have to be unsuppressed for JMP to eval them in the correct order.  Also, in JMP16 the formula evaluation engine even seems to have gotten even slower.  It now takes an unreasonable amount of time to eval my full table.

 

Between this, and the constant suggestions in these forums to just delete all the formulas and store fixed values only, I have come to the conclusion that JMP is simply not meant to handle dynamic formulas, only static data.  Formulas are nice at first, when your table is small and they still work properly, but when the formula count, formula complexity, or row count exceed a certain limit, you will hit a brick wall.  I am going to have to completely rework this project using some other tool, and use JMP for graphing/presentation/statistics only.

 

I am going to mark this as solved.  The solution is that there is no solution.  If you are experiencing this problem then JMP is not the right tool for the job.  Delete your dynamic formulas and store static data only.

hogi
Level XII

Re: Why does selecting/deselecting rows trigger column formulas to be re-evaluated?

@Chris_Rodrigues wrote:

Also, the most surprising part for me was that the mere selection of a row triggered a full table re-evaluation (as evidenced by CPU usage and UI responsiveness).  If a formula could be affected by excluding a row, why not trigger evaluation when the user actually excludes a row I guess it's theoretically possible that a user could write a column formula that depends on the current row selection state, but that seems like a rare edge case to me.


This also confuses me. Why doesn't Jmp distinguish between the row states: excluded and selected?
Even if the RowState was changed - there could be a subsequent check to see if the excluded() states changed.

I agree with @Chris_Rodrigues  - most column formulas will use Excluded() and not Selected().
[However, the ones with Selected() are the coolest :)]

 

Doesn't sound like a niche issue. For sample data with 5-1000 rows of data, this is not a big problem.
But for larger datasets, a user who uses Excluded() in column formulas has to pay a huge penalty!

 

 

 

 

Names Default To Here( 1 );
dt = New Table( "test",
	add rows( 5 ),
	New Column( "excluded",
		Formula(
			Print( As Date( today() ) );
			Excluded();
		)
	)
);

 

 

hogi
Level XII

Re: Why does selecting/deselecting rows trigger column formulas to be re-evaluated?