
Multithreading in JSL

As computing has advanced, I've gone from 1 core to 2 cores to 4 cores... to 64 threads on my workstation. Meanwhile, JSL has effectively remained single-threaded. I know some platforms will utilize all threads, but I'd like to be able to farm out simple JSL operations to multiple threads at will. For example, suppose a script iterates row by row in a For loop over 10,000,000 rows of data, and I know each row's result is independent of the other rows. I'd like to split that operation into 64 asynchronous For loops, pause other execution while the 64 loops complete in parallel, then return to a single-threaded, synchronous script once all are complete, much like I would in Python with Pool() and poolObject.join().

Tracking Number:

Defect ID: S1420157

1 Comment
Staff

As far as farming out simple JSL operations to multiple threads, there is a function called Parallel Assign() for that. It breaks the work into chunks where the code for each chunk results in a single number, and the result of the entire operation is a matrix of numbers.
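
A minimal sketch of the call shape, modeled on how the file-size script at the end of this reply uses it. This is not the Scripting Index example, and the names scale, s, and squares are made up for illustration:

```jsl
Names Default To Here( 1 );

// hypothetical global value that every worker needs
scale = 2;

// pre-size the result matrix; -1 marks elements not yet filled in
squares = J( 8, 1, -1 );

Parallel Assign(
	// copy the global into a thread-local variable named s
	{s = scale},
	// evaluated once per matrix element; i is that element's index
	squares[i] = s * i * i
);

Show( squares );
```

Each element of the matrix is one unit of work, so the size of the matrix passed in decides how many pieces the job is split into.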

 

There is a toy example in the Scripting Index for Parallel Assign(), and the second example for Frame Size() shows a more elaborate example that computes Mandelbrot set data. Another small example is in @Craige_Hales's JSL Cookbook post on loops, and there's more about Mandelbrot in his blog.

 

I've included another example below which finds the largest file in a list of files using multi-threading. The script runs a single-threaded version first, and you can see both spurts of activity in this task manager screenshot. The single-threaded version is the long mesa and the parallel version is the tall peak. For 17,000 files on 8 cores, the time went from 13s to 4.5s, or about 3x faster. Not ideal, but as is often the case, the threads are sharing some resources, such as disk access in this case.

[Image: parallelfilesize.png]

 

 

 

The JSL that executes in parallel does need to be simple so that it doesn't modify the shared global state. That's why some globals are copied in as the first step. It also means that the parallel code can't do things like create data tables or interact with the display. 

 

We're interested in knowing what other kinds of simple actions might need a richer interface. One that I know of is for each worker to return items other than numbers. Essentially, that would mean using a list for the results instead of a matrix.

 

----

 

Names Default To Here( 1 );
// Example for Parallel Assign().

// get a list of a lot of files
dir = If(Host is("Mac"), "$SAMPLE_DATA", "$JMP_HOME");
all files = Files In Directory( dir, "recursive" );
naf = N Items( all files );

// convert relative paths to full paths
For( i = 1, i <= naf, i++,
	all files[i] = dir || "/" || all files[i]
);

// Function that looks at a range of a given file list and returns
// the index of the largest file.
// Importantly, this function uses no global variables.
largest file = Function( {files, first, last},
	{i, i large = -1, bytes large = -1, bytes}, 
	For( i = first, i <= last, i++,
		bytes = File Size( files[i] );
		If( bytes > bytes large,
			bytes large = bytes;
			i large = i;
		);
	);
	i large;
);

// Single-threaded approach: send all the files to the function.
t0 = Tick Seconds();
lf = largest file( all files, 1, naf );
t1 = Tick Seconds();
Show( naf, lf, all files[lf], t1 - t0 );
// about 13sec on my machine (17k files on Win)

Wait( 1 );	// so we can see a break in a CPU activity monitor

// Multi-threaded approach: break the list up into 8 ranges that
// can be handled by separate threads.
t0 = Tick Seconds();
num chunks = 8;	// JMP may create this many threads, or fewer

// initialize to -1, so we can detect failure if desired
results = J( num chunks, 1, -1 );
Parallel Assign(
	// copy these globals into the thread's local variables
	{tf = Name Expr( largest file ), tfiles = all files, tn = num chunks},
	// make this assignment for each chunk:
	results[i] = Local( {n = N Items( tfiles )},
		tf( tfiles, 1 + Floor( (i - 1) * n / tn ), Floor( i * n / tn ) )
	)
);
// We need to call the function once more to see which of the top
// files is the overall largest.
top files = all files[results];
toplf = largest file( top files, 1, num chunks );
lf = results[toplf];
t1 = Tick Seconds();
Show( naf, lf, all files[lf], t1 - t0 );
// about 4.5sec
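
The range arithmetic inside the Parallel Assign() call above can be checked on its own. The sketch below reuses the same two Floor() expressions with small made-up values (n = 10 items, tn = 4 chunks, both chosen only for illustration) and prints the first and last index each chunk would receive:

```jsl
Names Default To Here( 1 );

// illustrative values only, not taken from the run described above
n = 10;	// number of items to split
tn = 4;	// number of chunks

For( i = 1, i <= tn, i++,
	// same expressions as the arguments passed to tf() above
	first = 1 + Floor( (i - 1) * n / tn );
	last = Floor( i * n / tn );
	Show( i, first, last );
);
```

With these values the ranges work out to 1-2, 3-5, 6-7, and 8-10, so every index from 1 to n lands in exactly one chunk. If n were smaller than tn, some ranges would come out empty (first greater than last), and largest file would return its initial -1 for those chunks, which is why the results matrix is worth checking before it is used as an index.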