I put together this simple script to divide a data table into training, test and validation subsets, by creating row state columns.
I then have one line of code at the end meant to copy the row states from the 'Training set' row state column.
The weird thing is that if I run the whole script, row states are not copied. If I run everything but the last line first, and then the last line separately, then it works.
Any ideas on how to fix this? Am I missing something or is it a bug?
New Column( "Random", Numeric, Formula( Random Uniform() ) );
New Column( "Data Set Indicator", Character,
    Formula( If( :Random <= 0.6, "Training", 0.6 < :Random <= 0.8, "Validation", "Test" ) )
);
New Column( "Training set", Row State,
    set formula( Combine States( Excluded State( :"Data Set Indicator" != "Training" ), Hidden State( :"Data Set Indicator" != "Training" ) ) )
);
New Column( "Validation set", Row State,
    set formula( Combine States( Excluded State( :"Data Set Indicator" != "Validation" ), Hidden State( :"Data Set Indicator" != "Validation" ) ) )
);
New Column( "Test set", Row State,
    set formula( Combine States( Excluded State( :"Data Set Indicator" != "Test" ), Hidden State( :"Data Set Indicator" != "Test" ) ) )
);
Column( "Training set" ) << Copy to row states();
Try sending the message Run Formulas to the data table. This message completes all formula evaluations before proceeding to the next JSL statement.
dt << run formulas;
If that doesn't work, try putting wait(0) instead.
With wait(0), JMP crashes. Run Formulas works.
Thanks! However, I still can't get my head around why it is needed. The script does three things:
1) I create a random number
2) I create row state columns based on the random numbers
3) I assign row states based on the row state columns
Why can 2) run without 'run formulas' after 1), while 3) only works if I specify 'run formulas' after 2)? Is there a logical reason, or is this one of the many undocumented quirks of the JSL scripting language?
JMP's column formulas are evaluated in the background; for tables with complicated formulas and lots of rows this allows JMP to remain interactive during the formula evaluations. You can scroll the table and work with other tables and platforms.
Sending the <<runFormulas message to the table makes the formulas finish evaluating before anything else happens. Using wait(N) is not a great answer because there is no way to know how long to wait. I tried wait(0) and that was not long enough. A one second wait worked, but was quite a bit longer than needed. <<runFormulas does not re-run the formulas if they have already finished evaluating, so you can use it without worrying about making the JSL code slower.
In your example, the JSL is still running, and there has not been any opportunity for background processing of the column formulas at the point when the formula column is copied. If you run the last line separately, the background evaluation happens before the last line is executed, so the row states are updated with the computed values.
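For reference, here is a minimal sketch of the original script with the fix applied. It assumes the table being split is the current data table (the `dt` variable is my addition, since the original script does not define it), and it only builds the "Training set" column to keep things short. The only functional change is the `Run Formulas` message sent before the copy.

```jsl
// Assumption: the table to split is the current data table.
dt = Current Data Table();

dt << New Column( "Random", Numeric, Formula( Random Uniform() ) );
dt << New Column( "Data Set Indicator", Character,
	Formula( If( :Random <= 0.6, "Training", :Random <= 0.8, "Validation", "Test" ) )
);
dt << New Column( "Training set", Row State,
	Formula(
		Combine States(
			Excluded State( :"Data Set Indicator" != "Training" ),
			Hidden State( :"Data Set Indicator" != "Training" )
		)
	)
);

// Make sure every pending formula has finished evaluating
// before the row state column is read.
dt << Run Formulas;

// The column now holds computed values, so the copy takes effect.
Column( dt, "Training set" ) << Copy To Row States;
```

The "Validation set" and "Test set" columns can be added the same way before the `Run Formulas` line; a single call covers all pending formulas in the table.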
Often JMP is aware when formulas need to finish evaluating; launching a new platform, for example. <<CopyToRowStates isn't aware, yet. Thanks for pointing it out.
I did not see a crash using wait(0). If the crash is reproducible, please work with tech support to send in the script and crash report files.
As a result of this thread, since JMP 12, the <<Run Formulas() message is called automatically behind the scenes with a New Table().