One day in 1971, Mr Hanson said “Today I’m going to teach you how to get the square root of a number. A faster way than the long-division way I’m supposed to teach. I’m going to teach you with one word. And you’ll never forget.” Then he writes on the chalkboard
Guess
and underlines it. When the laughter died down, he explained about dividing the guess into the the original number, average, repeat. He probably mentioned Newton-Raphson.
Newton’s method for finding roots or zeros of an equation uses the derivative of the equation to speed the process. The square root of 10 looks like finding a value of X that makes X^2 == 10 or X^2 – 10 == 0. The derivative of X^2 – 10 is 2X, and the heart of the loop for finding successive values of X is
Xnew = X - ( (X^2 - 10) / (2X) )
which, if you play with it, is exactly the guess formula, and converges like this, for an initial guess of 2:
2 – initial guess
3.5 – apply x = (x + 10/x)/2
3.17857142857143
3.16231942215088
3.16227766044414
3.16227766016838
3.16227766016838 – six iterations to a stable answer!
Here’s a JSL program to demonstrate.
f = Expr( x ^ 2 - V );
step = Eval Expr( x - Expr( Name Expr( f ) ) / Expr( Derivative( Name Expr( f ), x ) ) );
V = 1e2; // find the sqrt of V
x = -8; // initial guess.
xnext = .; // set up loop
n = 0; // counter
While( 1,
xnext = step;
If( Abs( x - xnext ) < Abs( 1e-14 * x ),
Break()
);
x = xnext;
Write( "\!n", n, " ", x );
n = n + 1;
);
Write( "\!ndone, a root of " || Char( Name Expr( f ) ) || " for V = " || Char( v ) || " is ", x );
The first line defines an expression, f, which will help find the square root of V. The second line, step, is the Newton-Raphson iteration for one cycle; step appears in the while loop where it evaluates to the next X value. The Derivative function works great on simple polynomials, giving 2X in this example. The NameExpr() gets the formula from f without evaluating it and the Expr() makes the NameExpr() run (to get the formula) when EvalExpr() finds the Expr(). Step is set to x - f(x)/f’(x).
Inside the while(1) loop, steps are taken and printed and counted. The test is checking the difference of the current and next X values; if the value is very small at the current size of X, it probably won’t get closer and might oscillate between two very close numbers, so it breaks out of the while loop.
0 -10.25
1 -10.0030487804878
2 -10.0000004646115
3 -10
done, a root of x ^ 2 - V for V = 100 is -10
Remember there is both a positive and a negative square root! Starting values determine which root the convergence leads to. Starting at -8 overshot to -10.25, then settled in to -10.
(Why does it divide by the derivative, f’ ? Intuitively, when the slope of the line f is close to zero, the next point will need to be further away to change the Y result of f, and dividing by the slope does that.)
The JSL above does not handle impossible requests well. If asked to find sqrt(-100), the JSL loops, looking for a complex number it is not equipped to find.
Complex Numbers
In the JSL scripting index, searching for complex produces two different half-baked examples. One is based on JSL classes and one based on namespaces and matrices. I think the matrix version is closer to a robust solution, and probably faster. The class version is tantalizing, but I had trouble using it (speed, memory issues). Since JMP does not have user overloads for operators, neither one is going to be easy to use.
This JSL is very similar to the previous JSL. There is a big chunk of code to handle the complex number arithmetic, the same formula, more-or-less, and now it handles sqrt(-100) too.
ns = New Namespace("complex", // adapted from scripting index example
{
make = Function( {a, b}, a || b ),
add = Function( {x, y, t=[0 0], u=[0 0], v=[0 0], w=[0 0]}, // an example has 5 terms
if( Is Number( x ),x = x || 0);
if( Is Number( y ),y = y || 0);
x + y + t + u + v + w;
),
sub = Function( {x, y, t=[0 0], u=[0 0], v=[0 0], w=[0 0]}, // an example has 3 terms. this sub,add trick is good enough for a demo.
if( Is Number( x ),x = x || 0);
if( Is Number( y ),y = y || 0);
x - y - t - u - v - w;
),
mul = Function( {x, y},
if( Is Number( x ),x = x || 0);
if( Is Number( y ),y = y || 0);
local:z = J( 1, 2 );
local:z[1] = x[1] * y[1] - x[2] * y[2];
local:z[2] = x[1] * y[2] + x[2] * y[1];
local:z;
),
div = Function( {x, y},
if( Is Number( x ),x = x || 0);
if( Is Number( y ),y = y || 0);
local:z = J( 1, 2 );
local:d = (y[1] ^ 2 + y[2] ^ 2);
local:z[1] = (x[1] * y[1] + x[2] * y[2]) / local:d;
local:z[2] = (x[2] * y[1] - x[1] * y[2]) / local:d;
local:z;
),
// https://math.stackexchange.com/questions/1397437/formula-for-raising-a-complex-number-to-a-power
// If a,b are real, then a+ib=r(cosθ+i*sinθ) where r^2=a^2+b^2 and tanθ=b/a, and (a + i*b)^N = r^N(cos(Nθ) + i*sin(Nθ))
pow = function( {c, n},
if( Is Number( c ),c = c || 0);
local:a = c[1];
local:b = c[2];
local:theta = ATan( local:b, local:a );
local:r = Sqrt( local:a ^ 2 + local:b ^ 2 );
local:r ^ n * (Cos( n * local:theta ) || Sin( n * local:theta ));
),
abs = function( {x},
Sqrt( x[1] ^ 2 + x[2] ^ 2 )
),
toChar = Function( {x,w=20,d=14}, // rounding (to 14 places) is not a good idea if the numbers are small ( 1e-9 will lose lots of precision)
Char( round(x[1],d),w ) || "+" || Char( round(x[2],d),w ) || "i"
),
convertExpression = function( {e},
substitute( Name Expr( e ),
Expr( Add() ), Expr( complex:add() ),
Expr( Subtract() ), Expr( complex:sub() ),
Expr( Multiply() ), Expr( complex:mul() ),
Expr( Divide() ), Expr( complex:div() ),
Expr( Power() ), Expr( complex:pow() )
)
)
}
);
Above added a pow() function for x^2 or x^3 etc where x is a complex number, an abs() function which maybe should be called magnitude() but matches up with the non-complex version as abs, some odd-looking extra parameters to add() and sub(), and a convertExpression() function. Big shout-out to stackexchange/Michael Hardy for an easy-to-grasp pow() function!
The convertExpression function massages a JSL expression with +, -, *, /, and ^ operators into an equivalent expression that calls the complex namespace functions. The JSL binary operators are pretty aggressive about turning ((a+b)+(c+d)) into add(a,b,c,d) which is not a binary function...it has four arguments. That’s why there are extra option arguments on the complex:add and :sub functions.
Next bit of JSL moves the loop into a function, anticipating future development:
findzero = function( {x},
{xnext, n},
xnext = .; // set up loop
n = 0; // counter
While( n < 200,
xnext = step;
If( complex:Abs( complex:sub( x, xnext ) ) < 1e-14 * complex:abs( x ),
Break()
);
x = xnext;
n = n + 1;
);
If( n == 200, x = [. .] );
Eval List( {x, n} );
);
findzero takes a starting point x and returns a list with the iterated solution x and the iteration count n.
And the code to call the function…the formula and derivative are the same for complex numbers.
f = Expr( x ^ 2 - V );
step = Eval Expr( x - Expr( Name Expr( f ) ) / Expr( Derivative( Name Expr( f ), x ) ) );
step = complex:convertExpression( Name Expr( step ) );
V = complex:make( -100, 0 ); // find the sqrt of V==-100
x = complex:make( 0, 1 ); // initial guess. Does not like 0i for a start value.
{x, n} = findzero( x );
Write( "\!ndone, a root of " || Char( Name Expr( f ) ) || " for V = " || complex:toChar( v, 10, 5 ) || " is ", complex:toChar( x, 10, 5 ) );
Eval( complex:convertExpression( Name Expr( f ) ) );
The 3rd line calls the convertExpression function to change the +, -, *, /, ^ operators from the real number versions to the complex number versions. V is set to a complex value and the initial starting point is set to a point off the real number line. When done, the log shows
done, a root of x ^ 2 - V for V = -100+0i is 0+10i
[0 0]
V=-100 + 0i is the real number -100 (with no imaginary component.) 0 + 10i is the pure imaginary answer with no real component; 10i * 10i is 100*(i^2), or -100, as expected. The complex roots occur in conjugate pairs; Newton’s method found one based on the starting point. The other one is at -10i.
Let’s modify the code to keep track of the iteration count n for a bunch of points in the complex plane and graph that, perhaps as a contour plot, and use cubic instead of quadratic:
f = Expr( x ^ 3 - V );
step = Eval Expr( x - Expr( Name Expr( f ) ) / Expr( Derivative( Name Expr( f ), x ) ) );
step = complex:convertExpression( Name Expr( step ) );
V = complex:make( -2, 0 ); // find the sqrt of V==-2
dt = New Table( "complex",New Column( "xre" ),New Column( "yim" ),New Column( "iterations" ),New Column( "basinr" ),New Column( "basini" ));
stepsize=1/32;
for(xreal=-2,xreal<=2,xreal+=stepsize,
for(yimag=-2,yimag<=2,yimag+=stepsize,
x = complex:make( xreal, yimag ); // initial guess. Does not like 0i for a start value.
{x, n} = findzero( x );
dt<<addrows(1);
dt:xre=xreal;
dt:yim=yimag;
dt:iterations = n;
dt:basinr = x[1];
dt:basini = x[2];
);
write(xreal);wait(0);
);
dt<<Graph Builder(
Size( 941, 911 ),
Show Control Panel( 0 ),
Variables(
X( :xre ),
Y( :yim ),
Color( Transform Column( "basin", Formula( 100 * round(:basini,2) + round(:basinr,2) ) ) ),
Size( :iterations )
),
Elements( Points( X, Y, Legend( 62 ) ) ),
SendToReport(
Dispatch(
{},
"400",
ScaleBox,
{Legend Model(
62,
Properties( 0, {Marker Size( 4 )}, Item ID( "iterations", 1 ) ),
Properties(
1,
{gradient( {Color Theme( "jet" )} )},
Item ID( "basin", 1 )
)
)}
)
)
);
All the points in the complex plane that iterate to the same root form a basin. Above, graph builder combines the real and imaginary values into a single value to identify a root.
The image shows the three basins in red, cyan, and dark blue. All the red starting points converge to the same value. The larger dot sizes take longer to converge. The gray spots did not converge after 200 iterations:
Three basins. Larger dots where the basins touch take longer to converge.
Doubling the size of the image from 16K to 65K points takes 10 minutes or so with the JSL complex number implementation. I want to make 10,000 bitmaps of 4Kx2K points, so I need another approach.
Beowulf Cluster
A Beowulf Cluster is a pile of old computers, mostly without displays or keyboards, but connected on a LAN to a controlling computer. In my case, the controlling computer uses JSL to tell the cluster what to do and to gather the results. The Linux-based cluster is running Python programs that JMP creates. The Windows computer running JMP uses PLINK from RunProgram to open one or more SSH sessions on each computer and supply the needed program and parameters. The Python program runs, then stores the results on a NFS disk, then exits. When JMP sees the RunProgram terminate, it gathers the results and starts the next RunProgram. 10,000 RunPrograms later there are enough images to make a five minute video.
Ethernet hub in center, below power-strip. Not enough mice and monitors and keyboards, SSH is best.
Python has a native complex number data type, so some of the JSL above goes away. The JSL now looks like this; you’ll see some Python embedded as a string in a bit.
F = Expr( // f and e parameters are driven by the frame animation time
f6 * xOld ^ e6 + f5 * xOld ^ e5 + f4 * xOld ^ e4 + f3 * xOld ^ e3 + f2 * xOld ^ e2 + f1 * xOld ^ e1 - 1
); // <<<<<<<<<<< supply equation here <<<<<<<<<<<<<<
scale = 2 / 1;
xresolution = 1920 * scale;
yresolution = 1080 * scale;
// keep aspect ratio by chopping shorter dimension...
xcenter = -1;
ycenter = 0;
delta = 3;
ytop = ycenter - delta * yresolution / xresolution;
xleft = xcenter - delta;
xright = xcenter + delta;
ybottom = ycenter + delta * yresolution / xresolution;
The bit above includes the equation and some video size parameters. The bit below describes the network. Hostkey, server, cpus, running, gigs are indexes into the condat (connection data) list of information about the cluster.
// named indexes for condat
hostkey = 1; // from a failed connection, this makes the connection work
server = 2; // the machine names
cpus = 3; // from the lscpu command, below
running = 4; // dynamic: how many are running on this cpu, right now
gigs = 5; // from the free command, below
// the avahi-daemon may need to be stop/start/status sometimes,
// unfortunately that is hard to do except at server keyboard if you can't connect...
The problem with the daemon still exists, but I now know it is advertising the machines with a -2 (etc) suffix and I can fix it without connecting a display.
condat = {
{"4e:c3:8a:3b:e7:d4:bd:0d:85:62:36:40:17:15:b4:a3", "dell790a.local", 4, 0, 22},
{"00:c2:37:09:dc:d7:cf:fb:33:e3:14:e3:d6:fb:39:be", "dell790b.local", 4, 0, 14},
{"54:c8:1a:6a:50:e3:51:02:ed:1b:f8:fa:bd:f4:e4:17", "dell7010a.local", 8, 0, 30},
{"66:0d:ca:3c:1f:11:eb:45:d4:92:d0:0b:ae:01:b8:6b", "dell7010b.local", 8, 0, 30},
{"94:84:2f:a5:56:2d:db:16:b5:80:a9:0d:87:07:ee:53", "dell9010.local", 8, 0, 30},
{"b2:b7:b3:c3:83:63:11:e7:56:e7:c9:c2:e6:fc:f0:c9", "thinkb.local", 8, 0, 30},
{"2b:ca:fc:45:f0:64:cd:ac:70:ce:93:73:e8:84:2d:75", "thinkc.local", 8, 0, 30},
{"62:b7:67:23:8d:86:fe:80:1a:62:ec:d8:41:1b:b4:1b", "thinkd.local", 8, 0, 30},
{"8e:54:ee:b0:25:67:b8:5c:4d:ad:5e:e9:71:87:fd:71", "netflix.local", 2, 0, 30} // host for vm
};
The JSL below looks through the condat list and converts the names like dell9010 into their dotted IP address (or IPV6 equivalent.) It prevents lookup failures later; the cluster ran for about three days and I’m still not sure why my wife’s desktop is providing the DNS lookup service…Before I remembered GetAddrInfo I used the commented ping code, which is a bit clumsier. My house is a lot happier with IPV6 now than it was a few years back. Sometime the IPV6 address belongs in square brackets.
// on my local network, the lookups for xxx.local fail randomly,
// and more often when busy. pre-lookup the dotted ip now.
//
// better way? Get Addr Info( "thinkd.local" )[3][4]
//
for( itry = 1, itry < 10, itry += 1,
nprocs = 0;
MAXCPU = 0;
good = 0;
for( iCondat = 1, iCondat <= N Items( condat ), iCondat += 1,
nprocs += condat[iCondat][cpus]; // nprocs will divide up the work (stepsize)
condat[iCondat][running] = 0;
MAXCPU = Max( MAXCPU, condat[iCondat][cpus] ); // MAXCPU loops needed later
if( Starts With( condat[iCondat][server], "[" ),
good += 1, // else
// txt = Run Program(
// Executable( "PING.EXE"/*path probably not needed*/ ),
// Options( {"-n 1", condat[iCondat][server]} ),
// ReadFunction( "text" )
// );
// ip = Regex( txt, "\[\[([^]]+)\]]\", "\0" ); // typ: [2603:6081:1a02:1200:8973:e7be:605c:f753]
try(
ip = "[" || Get Addr Info( condat[iCondat][server] )[3][4] || "]";
if( !Is Missing( ip ),
Write( "\!n", condat[iCondat][server] || " --> " || ip );
condat[iCondat][server] = ip;
good += 1;
, //
Write( "\!n", condat[iCondat][server] || " lookup failed ", Left( txt, 70 ) )
); //
, // catch
show(exception_msg,iCondat,condat[iCondat][server],Get Addr Info( condat[iCondat][server] ));
);
);
);
if( good == N Items( condat ),
Break(),
Wait( .1 )
);
);
Write( "\!n",good," of ", N Items( condat )," names resolved." );
The next bit is about credentials: the ppk is the JMP machine’s pivate key. The public part of that key is on each machine in the cluster. Plink needs that private bit to sign the SSH connection to the remote machine. The if(0) code is there so I can experiment on all the machines. At the end of the day, I might send them all the shutdown command. Be careful with the shutdown command if the condat list holds the Linux machine that is running the VirtualBox that is running this JSL script (line 9 above.)
remoteuser = "c@";
localkey = " -i c:\Users\v1\Desktop\keys\private.ppk ";
// test code examples...
if( 0, // sequential test using command
//command = " \!"python3.6 -m pip install numpy";
//command = " \!"python3.6 -m pip --version";
//command = " \!"python3.6 -m pip install -U pip";
//command = " \!"python3.6 --version";
command = " \!"hostname -A";
command = " \!"top -b -n 2|head -n 15";
// check if the common nfs mount is set up
//command = " \!"ls /mnt/nfsc"; // fstab: closet.local:/mnt/nfs /mnt/nfsc/ nfs defaults 0 0
// cycle the power; -r means restart. for this to work, must be sudo and not need password
//command = " \!"sudo shutdown -r now"; // visudo: c ALL=(ALL) NOPASSWD: /sbin/poweroff, /sbin/reboot, /sbin/shutdown
//command = " \!"sudo shutdown now"; // visudo: c ALL=(ALL) NOPASSWD: /sbin/poweroff, /sbin/reboot, /sbin/shutdown
// info about number of cores
//command = " \!"lscpu | egrep '^(Model name|Thread|CPU\(s\):)'";
// info about memory
//command = " \!"free -h";
for( iCondat = 1, iCondat <= N Items( condat ), iCondat += 1,
result = runprogram(
executable( "plink" ),
options(
localkey ||
/**/ " -hostkey " || condat[iCondat][hostkey] || /* this line hides the host key */
" " || remoteuser || condat[iCondat][server] || command
),
write function( Function( {this}, this << Write( "n\r\n" ) ) ),
readfunction( "blob" )
);
result = Regex( Blob To Char( result ), "(\r\n)+", "\r\n", GLOBALREPLACE );
Write( "\!n\!n", iCondat, " ", condat[iCondat][server], " ", result );
Wait( 0 );
);
Write( "\!n\!ntest done" );
);
The next bit is about running many copies of RunProgram concurrently. ImgNum is the key for each of these associative arrays and is the number of the image the Python program should be working on. ImgNumToRP holds a RunProgram object that is running. That object is moved to imgNumToRPdone as it is terminating. imgNumToResult holds the text output of the Python program, imgNumToMachine the id of the machine and imgNumToLabel a label for the picture. (The labels are applied later, more below.)
imgNumToRP = [=> ]; // holder for active RunProgram objects
imgNumToRPdone = [=> ]; // holder for RunProgram objects that need a moment to terminate
imgNumToRPresult = [=> ""]; // result strings from RunProgram objects
imgNumToMachine = [=> 0]; // which machine is this running on?
imgNumToLabel = [=>""]; // f(x) labels for graphs
nextImageNumberToMakePicture = 0;
// also: keep track of zeros from previous image
The unstick function handles the RunPrograms that don’t call the ReadFunction one last time after RunProgram terminates.
unstick = function({},{imagenumber},
for( imagenumber = imgNumToRP << first, !Is Empty( imagenumber ), imagenumber = imgNumToRPnext,
imgNumToRPnext = imgNumToRP << next( imagenumber );
if( (imgNumToRP[imagenumber]) << canread == 0 & (imgNumToRP[imagenumber]) << canwrite == 0 & (imgNumToRP[imagenumber]) << isreadeof == 1,
imgNumToRPresult[imagenumber] = "\!n" || char(imagenumber) || ": " || imgNumToRPresult[imagenumber] || "\!n"; // NoInterleave save this with count and basin
imgNumToRPdone[imagenumber] = imgNumToRP[imagenumber];
imgNumToRP << removeitem( imagenumber );
//if(imgNumToMachine[imagenumber]!=machine,throw("b machine not match"||char(machine)||","||char(imgNumToMachine[imagenumber])));
condat[imgNumToMachine[imagenumber]][running] = condat[imgNumToMachine[imagenumber]][running] - 1;
imgNumToMachine[imagenumber] = 0;
try(
write("\!nunsticking ",imagenumber);
process( imagenumber ); // must be after...
,
show(exception_msg);
);
Break(); // restart with <<first
);
);
);
The findmachine function below looks for a machine that doesn’t have too many SSH sessions started. The unstick() function is called periodically to look for the 10% of cases that get stuck.
findMachine = function( {},{iCondat,n}, // find a machine that is not filled up
n=0;
while( 1,
for( iCondat = 1, iCondat <= N Items( condat ), iCondat += 1,
if( condat[iCondat][running] < condat[iCondat][cpus],
Return( iCondat )
)
);
if(mod(n+=1,60)==0,
write("wait for machine...");
// look for stuck
unstick();
);
Wait( 1 ); // a full second is fine; the processes run for 10s of seconds
)
);
The process function is called when JMP detects a RunProgram has finished. The Python programs write “sentinel” at the end of their output and this code checks for it as a crude check that everything is running smoothly. Bindata is the binary file the Python programs produce in the farm directory on the NFS disk. Windows and JMP see the NFS disk at the \\VBOXSVR path but Linux and the Python programs see it at the //mnt path. This code was going to produce the pictures from the binary data but that was separated into a post processing step. Now it just reports the number of zeros discovered for the formula. The binary format is explained in the python code; JMP reads it and loads it with blobtomatrix, used twice here. The relaunch at the end probably works if uncommented and needed; I forgot to uncomment it and didn’t need it.
// process output from one thread in the farm
process = function( {imagenumber},
{bindata, start, stop, balanceBytes, matdata, countArray,basinArray,zeros},
//show("process",imagenumber);
if( Contains( imgNumToRPresult[imagenumber], "sentinel" ),
// process a file. maybe not this one if it is out of order... FIXME...nextImageNumberToMakePicture
bindata = Load Text File( "\\VBOXSVR\nfsc\farm\binary" || char(imagenumber) || ".generated", blob() );
matdata = blobtomatrix(
bindata,
"uint",
2,
"little",
2
);
countArray = J( yresolution, xresolution, 0 );
basinArray = J( yresolution, xresolution, 0 );
countArray[0] = matdata[1::nrows(countArray)*ncols(countArray), 1];
basinArray[0] = matdata[1::nrows(countArray)*ncols(countArray), 2];
balanceBytes = length(bindata) - 4*nrows(countArray)*ncols(countArray);
zeros = blobtomatrix(blobpeek(bindata,length(bindata)-balanceBytes),"float",8,"little",4);
write("\!nimagenumber=",imagenumber, if(!ismatrix(zeros)," zeros ***"," zeros="||char(nrows(zeros))));
deletefile("\\VBOXSVR\nfsc\farm\testpython"||char(imagenumber)||".py"); // source no longer needed
sentinels += 1;//
, // else
Write( imgNumToRPresult[imagenumber] );
Write( "\!nrelaunch disabled" || char(imagenumber) );
//launch( imagenumber );
)
);
The launch function starts another RunProgram. It uses findmachine to wait for an available machine, and supplies a custom .py file and custom parameters to the Python command. The .py file is created further below. The throw on line 5 below is a sanity check; the JSL is complicated enough I felt like I needed to write some extra code to watch for logic errors.
// launch a thread in the farm
launch = function( {imagenumber},{command,machine},
machine = findMachine();
//show("launch",imagenumber,machine);
if(imgNumToMachine[imagenumber]!=0,throw("image number already on machine"||char(imagenumber)||","||char(machine)));
imgNumToMachine[imagenumber] = machine;
condat[machine][running] = condat[machine][running] + 1;
//command = "\[python3 /mnt/nfsc/farm/testpython]\"||char(imagenumber)||"\[.py]\";
command = "\[hostname -A;python3 /mnt/nfsc/farm/testpython]\"||char(imagenumber)||"\[.py]\";
imgNumToRP[imagenumber] = runprogram(
executable( "plink" ),
options(
localkey || /* -v */ " -hostkey " || condat[machine][hostkey] || " " || remoteuser || condat[machine][server] || " " || command || //
" -o " || Char( 0 ) || // offset
" -s " || Char( 1 ) || // step
" -x " || Char( xresolution ) || //
" -y " || Char( yresolution ) || //
" -t " || Char( ytop ) || //
" -l " || Char( xleft ) || //
" -r " || Char( xright ) || //
" -b " || Char( ybottom ) || //
" -f " || "/mnt/nfsc/farm/binary" || char(imagenumber) || ".generated" || //
""
),
The parameter keyword makes its argument available to the readfunction. This is how you can have multiple RunPrograms in parallel, and make the read functions know which one they are.
Parameter( evallist({imagenumber,machine}) ),
The readfunction captures output and tries (about 90% of the time successful) to detect the RunProgram is finished. Because the RunProgram is still terminating and still running when this code runs, the handle is moved to imgNumToRPDone rather than just being removed. That allows it to finish cleanly. Parm, below, receives the Parameter value, above. There is another sanity check (throw) below.
readfunction(
function( {this, parm}, {txt, imagenumber,machine},
{imagenumber,machine} = parm ;
//show("readfunc",imagenumber,machine);
txt = "";
//write("readfun("||char(imagenumber)||")... ");
While( this << canread, txt = txt || (this << read) );
if( Length( txt ) > 0,
//show(txt);
imgNumToRPresult[imagenumber] = imgNumToRPresult[imagenumber] || txt
);
if( this << isreadeof & imgNumToRP << Contains( imagenumber ), // this *almost* always works, about 1:100 fail to get the readeof in the readrunction
imgNumToRPresult[imagenumber] = "\!n" || char(imagenumber) || ": " || imgNumToRPresult[imagenumber] || "\!n"; // NoInterleave save this with count and basin
process( imagenumber );
imgNumToRPdone[imagenumber] = imgNumToRP[imagenumber];
imgNumToRP << removeitem( imagenumber );
if(imgNumToMachine[imagenumber]!=machine,throw("a machine not match"||char(machine)||","||char(imgNumToMachine[imagenumber])));
condat[machine][running] = condat[machine][running] - 1;
imgNumToMachine[imagenumber] = 0;
);
)
)
);
(imgNumToRP[imagenumber]) << writeeof;
);
The makePythonCode function generates the .py file with the oneStep equation baked in. The python program includes an init() function that parses the command line to get picture size and output file name info. Data is being passed in two ways: command line parameters, and baked in equations. Possibly all of the information could have been baked in, but if there is no reason (like compiling the oneStep equation into the code) to bake parameters into the code, you’d only need one copy of the code for all the RunPrograms if it was completely controlled by command line parms. The Python code is written so it can do a slice of the problem if I wanted multiple computers working on a single image. It is faster overall to do a complete image at once and not recombine parts later.
makePythonCode = function({imagenumber,oneStep},{filename,python},
python =
"\[
import cmath, math, sys, getopt
import numpy as np
def main():
offset, step, ixdim, iydim, top, bottom, left, right, filename = init() # NoInterleave offset will be 0 and step will be 1
iymax = iydim - 1
ixmax = ixdim - 1
yrange = top - bottom
xrange = right - left
result = np.zeros( (int(math.ceil((iydim*ixdim-offset)/step)),2),dtype=np.int16) # count, basin # NoInterleave simpler, no change required though
iresult = 0
ix = offset # NoInterleave like iy, now begins at zero
iy = 0
while ix > ixmax: # normalize before loop starts
ix = ix - ixdim
iy = iy + 1
y = bottom + yrange * iy / iymax # after iy is normalized, before loop
roundedComplexToBasinId = {}
basinIdToBasinSize = {} # NoInterleave this might be useful, but needs to be appended to file (now a constant size for main chunk)
nextBasinId = 1 # reserve basin 1 for "did not converge"
while iy <= iymax:
x = left + xrange * ix / ixmax
xOld = complex(x,y) #xOld - (xOld ** 5 - 1) / (5 * xOld ** 4)
count = 0
try: ## catch zero divide, rarely
for irun in range(1,1000): # NoInterleave prev stopped at 100
###xNew = xOld - (xOld ** 5 - 1) / (5 * xOld ** 4)
]\" || // maintain the indent and newlines!
The next line is the baked in bit…OneStep is a JSL character string of a valid Python statement that sets xNew from the xOld value and Newton’s equation, like the ### comment two lines above.
oneStep || "\[
count += 1
delta = abs(xNew - xOld)
dist = abs(xOld)
if delta < 1e-15 * dist:
break
xOld = xNew
xOld = complex( round(xOld.real,10), round(xOld.imag,10) )
if irun >= 999:
basinId = 1 # arbitrary, but available always. did not converge.
else:
try: ## catch new basin discovered, rarely
basinId = roundedComplexToBasinId[xOld]
except KeyError as e:
nextBasinId += 1
roundedComplexToBasinId[xOld] = nextBasinId
basinId = nextBasinId
basinIdToBasinSize[basinId] = 0
### print(offset, count, xOld, basinId)
except ZeroDivisionError as e:
basinId = -1
### print(offset, e)
try:
result[iresult,0] = count
except:
print("oops",ix,iy)
if basinId >= 0:
try:
basinIdToBasinSize[basinId] += 1
except KeyError as e:
basinIdToBasinSize[basinId] = 1 # happens with basinId==1
result[iresult,1] = basinId
iresult += 1
ix = ix + step # advance
### offset = offset + step # needed for return value
while ix > ixmax: # re-normalize
ix = ix - ixdim
iy = iy + 1
y = bottom + yrange * iy / iymax
Here’s the python code that writes a binary file of integers and doubles. The doubles follow the integers to make it easy for JMP to grab everything with blobToMatrix.
# output results
fileobj = open(filename, mode='wb') # NoInterleave maybe prepend a descriptor
result.tofile(fileobj) # NoInterleave append after this constant size block
# NoInterleave append roundedComplexToBasinId and basinIdToBasinSize rather than printing them, they should be same dimension
# 4 8-byte doubles: real,imag,basin,size
temp = np.zeros( (len(roundedComplexToBasinId),4),dtype=np.float64)
itemp = 0
for iz in roundedComplexToBasinId.keys():
temp[itemp,0] = iz.real
temp[itemp,1] = iz.imag
temp[itemp,2] = roundedComplexToBasinId[iz]
temp[itemp,3] = basinIdToBasinSize[roundedComplexToBasinId[iz]]
itemp += 1
temp.tofile(fileobj)
fileobj.close()
print("sentinel"); # so JSL can test if this ran to the end...
def init():
offset = 0
step = 1
xdim = 3
ydim = 2
top = -1.0
bottom = 1.0
left = -1.0
right = 1.0
filename = "temp.binary"
options, remainder = getopt.getopt(sys.argv[1:], 's:x:y:o:t:b:l:r:f:')
for opt, arg in options:
if opt == '-o':
offset = int(arg) # offset is the starting point
elif opt == '-s':
step = int(arg) # step to the next point
elif opt == '-x':
xdim = int(arg) # xdim field int size
elif opt == '-y':
ydim = int(arg) # ydim field int size
elif opt == '-t':
top = float(arg) # top of field user coords
elif opt == '-b':
bottom = float(arg) # bot field user coords
elif opt == '-l':
left = float(arg) # left field user coords
elif opt == '-r':
right = float(arg) # right field user coords
elif opt == '-f':
filename = arg # write binary file
print(" xdim=",xdim,
" ydim=",ydim,
" offset=",offset,
" step=",step,
" top=",top,
" bottom=",bottom,
" left=",left,
" right=",right)
return offset, step, xdim, ydim, top, bottom, left, right, filename
if __name__ == "__main__":
main()
]\";
The Python program is saved to the NFS directory (as windows knows it) with a unique number. The returned file name is not used; the Linux version of the filename is created by the launch function and passed via Plink. (We are back in JSL now...)
filename = Save Text File( "\\VBOXSVR\nfsc\farm\testpython"||char(imagenumber)||".py", python ); // NoInterleave - unique file name for each picture because the equation changes
return(filename);
);
The loop below generated the 10,000 frames.
for( imagenumber = 0, imagenumber < 10000, imagenumber += 1, //<<<<< frame loop <<<<<<
write("\!nloop",imagenumber);
// for X^5 - 1 there are 5 roots; 1 real and 4 complex
// the python expression for x(n+1) = x(n) - F / F' is
// xNew = xOld - (xOld ** 5 - 1) / (5 * xOld ** 4)
The formula, and converting OneStep to a Python statement – this looks complicated because it is substituting actual values into the parameters F1...F6 and E1...E6. Hidden at the end is the change from JSL’s ^ to Python’s **, which is done in text, not expression, because JSL has no way to represent ** as an expression. The F1...E6 are handled as expressions. Sin( imagenumber / 200 ) is roughly 10 complete sine wave from -1 to 1. The sine waves are different frequencies because of the different divisors. The factors move smoothly from -2 to 2 and the exponents move smoothly about their 1...6 target by +/- 1. So the equation from one picture to the next is pretty similar, but varying a bit unpredictably.
FF = Derivative( Name Expr( F ), xOld );
// onestep will be inserted into python below
oneStep = substitute( // "xNew = xOld - (xOld ** 5 - 1) / (5 * xOld ** 4)"
char( // "xNew = xOld - (xOld ^ 5 - 1) / (5 * xOld ^ 4)"
substitute( // char() adds the parens around F and FF
Expr( xNew = xOld - F / FF ), // build the statement with symbols, then
Expr( F ), Name Expr( F ), // replace F and FF symbols with the content
Expr( FF ), Name Expr( FF ), // of the function and derivative
Expr( F6 ), round(2 * Sin( imagenumber / (200+5) ) / 2,3),
Expr( F5 ), round(2 * Sin( imagenumber / (200+7) ) / 2,3),
Expr( F4 ), round(2 * Sin( imagenumber / (200+11) ) / 2,3),
Expr( F3 ), round(2 * Sin( imagenumber / (200+13) ) / 2,3),
Expr( F2 ), round(2 * Sin( imagenumber / (200+17) ) / 2,3),
Expr( F1 ), round(2 * Sin( imagenumber / (200+19) ) / 2,3),
Expr( E6 ), round(6.0 + Sin( imagenumber / (200+23) ) / 2,3),
Expr( E5 ), round(5.0 + Sin( imagenumber / (200+29) ) / 2,3),
Expr( E4 ), round(4.0 + Sin( imagenumber / (200+31) ) / 2,3),
Expr( E3 ), round(3.0 + Sin( imagenumber / (200+37) ) / 2,3),
Expr( E2 ), round(2.0 + Sin( imagenumber / (200+41) ) / 2,3),
Expr( E1 ), round(1.0 + Sin( imagenumber / (200+47) ) / 2,3)
) // sub F,FF
), // char
"^", "**"
); // substitute ^ in JSL to ** in python
The imgNumToLabel associative array may be used later to label images. Simplify xOld->X...
imgNumToLabel[imagenumber]=substitute(regex(oneStep,"xNew = xOld - \((.*?)\) / ","\1")," ** ","^"," * ","·","xOld","X");
write("\!n",imgNumToLabel[imagenumber]);
//"/mnt/nfsc/farm/binary" || char(imagenumber) || ".generated"
The test for an existing file was used during development to speed things along when the program had to be restarted. It may not work quite right, it was not used for the final run (three days!)
if(fileexists("\\VBOXSVR\nfsc\farm\binary" || char(imagenumber) || ".generated"),
write("Exists ","\\VBOXSVR\nfsc\farm\binary" || char(imagenumber) || ".generated");
continue();
);
makePythonCode(imagenumber,oneStep);
This looks like out-of-place abandoned code.
n = 0; // eventually this becomes the offset // NoInterleave offset is always 0
sentinels = 0; // NoInterleave "sentinel" might be all it returns
//write("Launching... ");
This is where the 10,000 RunPrograms begin. 50 or 60 will be running at the same time using the condat values, way above.
launch( imagenumber );
write("Launched. ")
);
The wrap up...wait for the last few RunPrograms to end, unstick may find one or two.
while( N Items( imgNumToRP ) > 0,
Wait( .1 );
unstick();
);
write("\!ndone");
The labels get saved for the image generator, later. JSL/Python above already saved binary data that still needs conversion to .png format.
savetextfile("$desktop/imgNumToLabel.txt",char(imgNumToLabel));
The Rpdone is holding 10,000 terminated RunPrograms. Let it go.
imgNumToRPdone = 0;// clean out // NoInterleave picture generation happens after all count and basin are created
The image generator can be run now, or later...
include("pythonFarmDisplay.jsl");
That’s the end of the Beowulf code. The picture generator could have used it, but didn’t. I want the video to use the same colors from one frame to the next. The colors represent basins of attraction, the zero that any point in the basin is attracted to should get the same color. But the equation changes, a little, on every frame, so the zeros are moving around. The picture generator tries to match up the zeros on the previous picture with the next picture. Sometimes they jump, but it is good enough.
The JSL to make the pictures follows. First make a color table for the basins. These are the lightest possible version of the color; they get darkened later by the iteration count being large.
colors=
hlscolor(0,L,1)|| // dark red
hlscolor(.05,L,0.8)|| // red
hlscolor(.15,L,0.9)|| // orn
hlscolor(.35,L,1.0)|| // yel
hlscolor(.50,L,1.0)|| // grn
hlscolor(.65,L,1.0)|| // cyn
hlscolor(.80,L,0.7)|| // blu
hlscolor(.95,L,1.0)|| // mag
hlscolor(1.0,L,0.0); // gry
redmap=j(ncols(colors),1,0);
greenmap=j(ncols(colors),1,0);
bluemap=j(ncols(colors),1,0);
for(i=1,i<=ncols(colors),i+=1,
{r,g,b}=colortorgb(colors[i]);
m = max(r,g,b);
redmap[i] = r/m;
greenmap[i] = g/m;
bluemap[i] = b/m;
);
The image labels saved earlier need reloading, picture dimensions need to match. This code should be the same as earlier, and should have been extracted into a common parameter file. Too late now!
imgNumToLabel=Include( "$desktop/pythonFarmDisplayLabels.jsl" );
scale = 2 / 1;
xresolution = 1920 * scale;
yresolution = 1080 * scale;
// keep aspect ratio by chopping shorter dimension...
xcenter = -1;
ycenter = 0;
delta = 3;
ytop = ycenter - delta * yresolution / xresolution;
xleft = xcenter - delta;
xright = xcenter + delta;
ybottom = ycenter + delta * yresolution / xresolution;
The colors for the basins are supposed to track across slightly different picures/equations. This is where the previous picture’s zeros live.
prevzeros = []; // what came before the first one
The NFS disk was way too small. There is 309GB of uncompressed pixel info here.
//filepat = "\\VBOXSVR\nfsc\farm\binary^imagenumber^.generated";
// manually moved the files (running out of space)...
filepat = "f:\newtonmethod\generated\binary^imagenumber^.generated";
countarray = J( yresolution, xresolution, 0 );
basinarray = J( yresolution, xresolution, 0 );
The binary files will be unpacked into two JSL matrices: a count of how many iterations the pixel needed, and a small integer identifying the pixel’s basin. There is a little more binary data after that (the balance bytes) that takes the basin number to a complex coordinate; the complex coordinate is what gets matched up from one picture to the next.
For( imagenumber = 1, imagenumber < 10000, imagenumber += 1,
file = Eval Insert( filepat );
If( File Exists( file ),
bindata = Load Text File( file, blob() );
matdata = Blob To Matrix( bindata, "uint", 2, "little", 2 );
countArray[0] = matdata[1 :: yresolution * xresolution, 1];
basinArray[0] = matdata[1 :: yresolution * xresolution, 2];
balanceBytes = Length( bindata ) - 4 * yresolution * xresolution;
zeros = Blob To Matrix( Blob Peek( bindata, Length( bindata ) - balanceBytes ), "float", 8, "little", 4 );
//Show( zeros );
if(zeros[1, 3]==2,
zeros = [9e99 9e99 1 0] |/ zeros; // python code should have put this in
);
// re-assign the basin numbers to line up with prevzeros
If( N Rows( prevzeros ) == 0,
prevzeros = zeros
);
The KDTable finds nearest neighbors; the complex plane is a fine neighborhood.
kdt = KDTable( prevzeros[0, 1 :: 2] ); // 1::2 means the real and imag cols
newbasin = (1 :: N Rows( zeros )) + 10000;
toomany = N Rows( zeros ) > N Rows( prevzeros );
For( i = 1, i <= N Rows( zeros ), i += 1,
If( zeros[i, 3] != i,
Throw( "basin index assumption wrong" )
); // zeros are sorted by basin number
{indx, dist} = kdt << knearestrows( 2, zeros[i, 1 :: 2] );
If( N Cols( dist ) > 1 & dist[1] < dist[2] / 10, // assume 10X closer is a good match
newbasin[i] = indx[1];
kdt << removerows( indx[1] );//
, /*else*/ !toomany, // if there are toomany new zeros, don't assign them this way
newbasin[i] = indx[1];
kdt << removerows( indx[1] );
);
);
The <<knearestrows picks the 2 nearest zeros from the old picture to the zero in the new picture. If the first one is at least 10X closer, or if the new picture doesn’t have more zeros than the older picture, the first zero returned is the one we want. When done, the newbasin describes how to rearrange the zeros to match up colors better.
/*
zerobasin newbasin
1 4 change 1s to 4s in basinArray and zeros
2 3 2s to 3s
3 10001 too many case! no basin assigned. becomes 7, below
4 2
5 5
6 6
7 1
*/
unassigned = Loc( newbasin > 10000 );
If( N Rows( unassigned ) > 0,
The new pic has more zeros than the old pic, put them in the last basin.
newbasin[unassigned] = newbasin[unassigned] - 10000 + N Rows( zeros )
);
basinArray = Shape( newbasin[basinArray], N Rows( basinArray ), N Cols( basinArray ) );
zeros[0, 3] = Shape( newbasin[zeros[0, 3]], N Rows( zeros ), 1 );
ri = Rank Index( zeros[0, 3] );
zeros[0, 0] = zeros[ri, 0]; // zeros are sorted by basin number
The zeros are sorted above; row 1 gets color 1, etc.
prevzeros = zeros;
The pictures with high counts are discarded to make the video smoother.
countarray[loc(countarray>200)] = 200; // 150 is typical big count, 999 is not converged
rangecount = Range( countarray );
countarray=countarray-3; // 3 or 4 is typical minimum count, make it bright (0)
countarray[loc(countarray<0)]=0;
The GrayArray is scaled darkness from the counts
grayArray = (1 - (countArray /*- rangeCount[1]*/) / (200 /*rangeCount[2] - rangeCount[1]*/)) ^ 8;
rangebasin = range(basinarray);
//basinArray = basinArray / (7);
nc=ncols(colors);
basinarray[loc(basinarray>nc)]=ncols(colors); // if these exist they could be mapped to gray, perhaps
basinarray[loc(basinarray<1)]=ncols(colors); // ditto
The gray * redmap[basinarray] (etc) is a scaled rgb color component
red = grayArray :* shape(redmap[basinarray], N Rows( basinArray ), N Cols( basinArray ) );//(1 + Sin( 2 * Pi() * basinArray + (0 * 2 * Pi() / 3) )) / 2;
green = grayArray :* shape(greenmap[basinarray], N Rows( basinArray ), N Cols( basinArray ) );//(1 + Sin( 2 * Pi() * basinArray + (1 * 2 * Pi() / 3) )) / 2;
blue = grayArray :* shape(bluemap[basinarray], N Rows( basinArray ), N Cols( basinArray ) );//(1 + Sin( 2 * Pi() * basinArray + (2 * 2 * Pi() / 3) )) / 2;
The fixups for over/underflow the 0..1 color values…
red[Loc( red < 0 )] = 0;
red[Loc( red > 1 )] = 1;
green[Loc( green < 0 )] = 0;
green[Loc( green > 1 )] = 1;
blue[Loc( blue < 0 )] = 0;
blue[Loc( blue > 1 )] = 1;
The + markers are added at the zero locations… crude, but works. Don’t draw on the edge, matrix indexing does whole rows or columns for index of zero!
crosssize = floor(yresolution / 100);
For( iz = 1, iz <= N Rows( zeros ), iz += 1,
real = Round( xresolution * Interpolate( Num( zeros[iz, 1] ), xleft, 0, xright, 1 ) );
imag = Round( yresolution * Interpolate( Num( zeros[iz, 2] ), ytop, 0, ybottom, 1 ) );
if(crosssize+5 < imag < nrows(red)-crosssize-5 & crosssize+5 < real < ncols(red)-crosssize-5,
red[imag - 1, real - crosssize :: real + crosssize] = 0.5;
red[imag + 1, real - crosssize :: real + crosssize] = 0.5;
red[imag - 2, real - crosssize :: real + crosssize] = 0.5;
red[imag + 2, real - crosssize :: real + crosssize] = 0.5;
red[imag - crosssize :: imag + crosssize, real - 1] = 0.5;
red[imag - crosssize :: imag + crosssize, real + 1] = 0.5;
red[imag - crosssize :: imag + crosssize, real - 2] = 0.5;
red[imag - crosssize :: imag + crosssize, real + 2] = 0.5;
green[imag - 1, real - crosssize :: real + crosssize] = 0.5;
green[imag + 1, real - crosssize :: real + crosssize] = 0.5;
green[imag - 2, real - crosssize :: real + crosssize] = 0.5;
green[imag + 2, real - crosssize :: real + crosssize] = 0.5;
green[imag - crosssize :: imag + crosssize, real - 1] = 0.5;
green[imag - crosssize :: imag + crosssize, real + 1] = 0.5;
green[imag - crosssize :: imag + crosssize, real - 2] = 0.5;
green[imag - crosssize :: imag + crosssize, real + 2] = 0.5;
blue[imag - 1, real - crosssize :: real + crosssize] = 0.5;
blue[imag + 1, real - crosssize :: real + crosssize] = 0.5;
blue[imag - 2, real - crosssize :: real + crosssize] = 0.5;
blue[imag + 2, real - crosssize :: real + crosssize] = 0.5;
blue[imag - crosssize :: imag + crosssize, real - 1] = 0.5;
blue[imag - crosssize :: imag + crosssize, real + 1] = 0.5;
blue[imag - crosssize :: imag + crosssize, real - 2] = 0.5;
blue[imag - crosssize :: imag + crosssize, real + 2] = 0.5;
red[imag, real - crosssize :: real + crosssize] = 1;
red[imag - crosssize :: imag + crosssize, real] = 1;
green[imag, real - crosssize :: real + crosssize] = 1;
green[imag - crosssize :: imag + crosssize, real] = 1;
blue[imag, real - crosssize :: real + crosssize] = 1;
blue[imag - crosssize :: imag + crosssize, real] = 1;
);
);
label = imgNumToLabel[imagenumber];
The label is added at the bottom. Make a tiny bitmap of the label, then copy it into the bigger image.
if(1,
//label = char(imagenumber)||" zeros="||char(N Rows( zeros ))||" "||label;
labelpic = textbox(label,setwrap(3000),<<setfontsize(15));
{labelpixr,labelpixg,labelpixb} = (labelpic<<getpicture)<<getpixels("rgb");
tall=nrows(labelpixr);
labelpixr = labelpixr[(tall/9)::(tall-tall/12),0]; // trim
labelpixg = labelpixg[(tall/9)::(tall-tall/12),0]; // trim
labelpixb = labelpixb[(tall/9)::(tall-tall/12),0]; // trim
labelLeft = floor((ncols(red)-ncols(labelpixr))/2);
labelRight = labelLeft + ncols(labelpixr) - 1;
labelBottom = nrows(red);
labelTop = labelBottom - nrows(labelpixr) + 1;
red[labelTop::labelBottom, labelLeft::labelRight] = labelpixr;
green[labelTop::labelBottom, labelLeft::labelRight] = labelpixg;
blue[labelTop::labelBottom, labelLeft::labelRight] = labelpixb;
);
The image is made here. Then decide if it will be kept in the video by naming it.
img = New Image( "rgb", {red, green, blue} );
if(6<=nrows(zeros)<=8 & 2<=rangebasin[1]<=4 & 6<=rangebasin[2]<=8 & 3<=rangecount[1]<=4 & rangecount[2]<200,
img << saveimage( "f:/newtonmethod/png3/keeper" || Char( 1e6 + imagenumber ) || ".png", "png" );
,
img << saveimage( "f:/newtonmethod/png3/zapped" || Char( 1e6 + imagenumber ) || ".png", "png" );
);
//img<<scale(900/yresolution);
//New Window( "rgb", img );
write("\!n"||char(imagenumber)||" nz="||char(nrows(zeros))||" rb="||char(rangebasin)||" rc="||char(rangecount )||" " ||label);
Wait( 0 );
);
);
That’s it! Here’s the video. I decided to only make the scientific version, with flickering dark frames where they happen and a busy caption below. A pretty version could be smoother and even more hypnotic. It is a 4K video, if you have the hardware and connection speed, otherwise youtube will scale it down for you.
There's two hours of 2020 left as I post this. I'm really looking forward to 2021.
runPythonFarmNoInterleaveBlog.jsl
pythonFarmDisplayBlog.jsl
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.