JSL loops have an undeservedly bad reputation. If the guts of the loop do much work, the loop's overhead will usually not matter much. And any solution without a JSL loop is going to involve a loop somewhere else, perhaps in C++, which will be a bit faster, but it is still a loop. Anyway, here's a comparison of two methods, one involving a JSL loop and one using a loop hidden away in JMP's C++ code.
Graph showing crossover times of the two algorithms at around 39,000 elements.
dtGraph = New Table( "stats",
    Add Rows( 0 ),
    New Column( "size", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [] ) ),
    New Column( "table time", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [] ) ),
    New Column( "loop time", Numeric, "Continuous", Format( "Best", 12 ), Set Values( [] ) )
);
For( n = 1, n < 1e7, n *= 2,
    dtGraph << addrows( 1 );
    dtGraph:size[N Rows( dtGraph )] = n * 3; // 3 items repeated...
    list1 = Repeat( {"a", "b", "c"}, n );
    list2 = Repeat( {"1", "2", "3"}, n );

    start = HP Time(); // data table method begins here
    dt = New Table( "temp",
        New Column( "a", Character, "Nominal", Set Values( list1 ) ),
        New Column( "b", Character, "Nominal", Set Values( list2 ) ),
        New Column( "c", Character, "Nominal", formula( a || ":" || b ) ),
        invisible
    );
    dt << runformulas;
    result = dt:c << getvalues;
    Close( dt, nosave );
    stop = HP Time(); // data table method ends here
    Show( stop - start );
    result = Concat Items( result, "," );
    Show( Length( result ), Left( result, 100 ) );
    dtGraph:table time[N Rows( dtGraph )] = (stop - start) / 1e6;

    result = {};
    start = HP Time(); // loop method begins here
    While(
        a = Remove From( list1, 1 );
        b = Remove From( list2, 1 );
        N Items( a ); // as long as something was removed...
    , // do this...
        Insert Into( result, a[1] || ":" || b[1] )
    );
    stop = HP Time(); // loop method ends here
    Show( stop - start );
    result = Concat Items( result, "," );
    Show( Length( result ), Left( result, 100 ) );
    dtGraph:loop time[N Rows( dtGraph )] = (stop - start) / 1e6;
    Wait( 1 );
);
dtGraph << Graph Builder(
    Size( 1200, 500 ),
    Show Control Panel( 0 ),
    Variables( X( :size ), Y( :table time ), Y( :loop time, Position( 1 ) ) ),
    Elements( Points( X, Y( 1 ), Y( 2 ), Legend( 5 ) ), Smoother( X, Y( 1 ), Y( 2 ), Legend( 6 ) ) ),
    SendToReport(
        Dispatch( {}, "size", ScaleBox, {Scale( "Log" ), Inc( 1 ), Minor Ticks( 1 )} ),
        Dispatch( {}, "table time", ScaleBox, {Scale( "Log" ), Format( "Best", 12 ), Inc( 1 ), Minor Ticks( 1 )} )
    )
);
JMP 14 is the oldest I have access to. I think the loop algorithm will have similar performance in JMP 12. The loop algorithm intentionally destroys the two lists as it processes them, because older JMP versions did not have fast indexing into lists, but removing the first element is fast. The data table algorithm runs first in each pass of the For loop so it can use the lists before they are destroyed.
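For comparison, here is a minimal sketch of the non-destructive, indexed form of the loop that the Remove From approach is avoiding. It is not part of the timing script, the three-item lists are made up for illustration, and how slow the list1[i] lookups actually are on long lists will depend on the JMP version.

```jsl
// Hypothetical small inputs, for illustration only
list1 = {"a", "b", "c"};
list2 = {"1", "2", "3"};

// Indexed loop: list1 and list2 are left intact afterwards,
// but every pass looks up list1[i] and list2[i] by position
result = {};
For( i = 1, i <= N Items( list1 ), i++,
    Insert Into( result, list1[i] || ":" || list2[i] )
);
Show( result );
```

The trade-off is that this version keeps the input lists but pays for an index lookup on every pass, while the While loop only ever touches the first element and leaves the lists empty.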
Looking back at the JSL, I shouldn't claim either algorithm is simpler or easier to follow.
Craige