I started this project by making a collection of 13,000 512x512 AI images. The next step is to load some metadata about the images into a data table. Later I'll be choosing images with similar colors to replace parts of a zoomed-in image. The table looks like this:
The H, L, S and R, G, B values match the swatch color, and that color should match some color in the sample.
This loads the filenames, but not the pictures yet.
imageDir = "\\VBOXSVR\images\"; // ~13,000 512x512 AI generated images
files = Files In Directory( imageDir );
nf = N Items( files );
nfolds = 3; // the images are divided up into 3 sets to make sure the same images don't appear too soon
dt = New Table( "lookup",
Add Rows( nf ),
Set Cell Height( 51 ),
New Column( "filename", Character, setvalues( files ), Set Display Width( 356 ) ),
New Column( "wide" ),
New Column( "tall" ),
New Column( "h_" ), // several nearest neighbor techniques tried;
New Column( "l_" ), // the video uses HLS to find similar colors
New Column( "s_" ),
New Column( "r" ),
New Column( "g" ),
New Column( "b" ),
New Column( "sample", Expression, "None", Set Display Width( 85 ) ),
New Column( "swatch", Expression, "None", Set Display Width( 81 ) ),
New Column( "fold123", formula( Mod( Row(), nfolds ) ) ),
New Column( "nUsed", Numeric, "Nominal", <<Set Each Value( 0 ) )
);
To get a patch of color from the sample image, use HLS rather than RGB so similar hues can be grouped together, and bin the hues into 32 distinct hues. Load each image and put the reduced-size picture in the sample column (just for debugging). In the following code, {hue, lite, sat} = Color To HLS( img << getpixels ); stores a 512x512 matrix into each of the hue, lite, and sat variables. We'll be doing element-wise matrix math on them.
One of the issues with HLS is that the H (hue, what you might think of as color) component is meaningless for shades of gray, including black and white. I'm choosing to use H=1.0 for any color that is close to gray so the grays bin together. If the dominant patch of color is a really dark yellow, blue, etc., it will be binned as black.
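The gray-forcing rule can be sketched as a tiny Python helper (the function name is mine; the thresholds match the JSL further down):

```python
def force_gray(h, l, s):
    """Reserve hue 1.0 for near-gray pixels; real hues stay in [0, 1)."""
    if s < 0.4 or l < 0.3 or l > 0.7:  # unsaturated, too dark, or too light
        return 1.0
    return h
```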
After binning the H, L, S values that were zero to one (multiply by nbuckets (32) and floor the result), one statement makes an array named total, full of zeros, and another populates it using the ++ operator and a complicated subscript. Later, when Loc Max retrieves the index of the largest total, the index can be decoded back into H, L, S values to determine what color was there. (Originally I thought there would be more than 32*32*32 bins and the space optimization might have been more important.)
The while-keeplooking loop tries to find a non-grayscale color with at least 100 pixels in the image; if that isn't possible, the biggest patch of binned gray is used.
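The encode/decode arithmetic is just base-33 positional notation. Here is a 0-based Python sketch (the JSL adds 1 for its 1-based subscripts; the helper names are mine):

```python
NBUCKETS = 32
NPLUS = NBUCKETS + 1  # bin 32 exists because a value of exactly 1.0 floors to 32

def encode(h, l, s):
    """Floor-bin H, L, S in [0, 1] and pack the bins into one 0-based index."""
    hb, lb, sb = (int(v * NBUCKETS) for v in (h, l, s))
    return NPLUS * NPLUS * hb + NPLUS * lb + sb

def decode(idx):
    """Recover the bin-center H, L, S from a packed index."""
    hb, rest = divmod(idx, NPLUS * NPLUS)
    lb, sb = divmod(rest, NPLUS)
    return tuple((b + 0.5) / NBUCKETS for b in (hb, lb, sb))
```

A forced-gray hue of exactly 1.0 lands in bin 32, so its decoded center is 32.5/32, which is above 1; that is what the dt:h_ >= 1 check near the end of the loop detects.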
nbuckets = 32; // the HLS values are binned into 32 ranges to find similar colors
nbucketsPlus = nbuckets + 1;
For Each Row(
dt,
path = imageDir || dt:filename;
img = New Image( path );
{dt:wide, dt:tall} = img << size;
{hue, lite, sat} = Color To HLS( img << getpixels );
img << scale( .125 );
dt:sample = img; // for debugging, mostly, keep a small copy in the datatable
If( Any( hue >= 1 ),
Show( Row(), Loc( hue >= 1 ) );
Throw( "bad assumptions that hue<1" );
); // use to flag gray scale colors...
// unsaturated is gray; force the hue to 1. Real hues are 0 to .9999
// similar for very low or high lightness
hue[Loc( sat < 0.4 )] = 1; // gray if unsaturated, may be redundant with one of...
hue[Loc( lite < 0.3 )] = 1; // black if low lightness
hue[Loc( lite > 0.7 )] = 1; // white if high lightness
// binning
hue = Floor( hue * nbuckets );//show(hue) 0..32 inclusive; 32 is the forced gray
lite = Floor( lite * nbuckets );//show(lite) 0..32 inclusive
sat = Floor( sat * nbuckets );//show(sat) 0..32 inclusive
total = J( nbucketsPlus * nbucketsPlus * nbucketsPlus, 1, 0 );
// HLS is encoded into the subscript for total[]... decoded further below...
total[((nbucketsPlus ^ 2) * hue + nbucketsPlus * lite + sat + 1)[0]]++;
// the HLS chosen for this picture is the largest patch of a
// binned color that is colorful, or gray if no color seems to stand out...
keeplooking = 1;
firstidx = 0;
While( keeplooking,
idxPlus = Loc Max( total ); // a large bin. Often the first one is not a colorful one.
idx = idxPlus - 1;
If( firstidx == 0,
firstidx = idx // But the first one is the gray level if all are non colorful
);
// arbitrarily, there must be at least 100 colorful pixels of the same binned color,
// or use gray...
If( total[idxPlus] < 100, // no color found, use the original gray and the hue will be 1
Write( "\!nOriginal gray" );
dt:h_ = (Floor( firstidx / (nbucketsPlus ^ 2) ) + .5) / nbuckets;
If( dt:h_ < 1,
Throw( "hue?" )
);
firstidx = Mod( firstidx, (nbucketsPlus ^ 2) );
dt:l_ = (Floor( firstidx / (nbucketsPlus) ) + .5) / nbuckets;
firstidx = Mod( firstidx, (nbucketsPlus) );
dt:s_ = (firstidx + .5) / nbuckets;
keeplooking = 0;//
, //
dt:h_ = Floor( idx / (nbucketsPlus ^ 2) ); // decode HLS uses 0-based idx
If( dt:h_ == nbuckets,
//Write( "\!nignore gray" );
total[idxPlus] = 0;// try another until get a color
, //
dt:h_ = (dt:h_ + .5) / nbuckets; // 0 to 1 recovered from binned index
Write( "\!ngot color" );
idx = Mod( idx, (nbucketsPlus ^ 2) );
dt:l_ = (Floor( idx / (nbucketsPlus) ) + .5) / nbuckets;
idx = Mod( idx, (nbucketsPlus) );
dt:s_ = (idx + .5) / nbuckets;
keeplooking = 0;
);
);
);
If( dt:h_ >= 1,
dt:h_ = 1;
dt:s_ = 0;
);
color = HLS Color( dt:h_, dt:l_, dt:s_ );
{dt:r, dt:g, dt:b} = Color To RGB( color ); // later use the RGB equivalent for kdtable nearest neighbor
dt:swatch = New Image( J( 16, 16, color ) ); // for debugging, need to see a swatch of the chosen color
If( Mod( Row(), 10 ) == 0,
dt << gotorow( Row() );
Wait( 0 );
);
);
For Each Row( dt, Row State( dt, Row() ) = Combine States( Marker State( 12 ), Color State( RGB Color( r, g, b ) ) ) );
A KDTable finds nearest neighbors in a multidimensional space. I'm using 4 dimensions here; the extra dimension creates three sets of pictures that almost always stay separated.
kdlookup = KDTable( dt[0, {r, g, b, fold123}] );
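Appending fold123 as a fourth coordinate works because the RGB coordinates live in the unit cube (diagonal about 1.73) while folds differ by whole units, so same-fold rows nearly always win the distance race. A brute-force Python sketch of the same k-nearest query (made-up data, no KD tree):

```python
import math

def k_nearest(rows, query, k):
    """rows: (r, g, b, fold) tuples with r, g, b in [0, 1]; plain O(n) KNN."""
    ranked = sorted(range(len(rows)), key=lambda i: math.dist(rows[i], query))
    return ranked[:k]

rows = [
    (0.50, 0.50, 0.50, 0),  # same color, fold 0
    (0.50, 0.50, 0.50, 1),  # same color, wrong fold
    (0.90, 0.10, 0.10, 0),  # different color, fold 0
]
```

Querying for mid-gray in fold 0 ranks the wrong-fold duplicate behind even a fairly different color from the right fold.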
Next are some constants, some pixel arrays, and a starting image.
VideoRows = 1080;
VideoCols = 1920;
VideoSize = Eval List( List( videocols, videorows ) );
FudgeScale = 1.00001; // fix rounding error in <<scale -- fractional scales truncate
// big and small are both actually the big size so they line up; small is scaled up
// to make it align. the slices will be combined, added together, and scaled to VideoSize.
BigBitmap = J( 512 * 48 * 9 / 16, 512 * 48, 0 );
SmlBitmap = J( 512 * 48 * 9 / 16, 512 * 48, 0 );
// this is the starting image, which will be the final frame of the reversed first sequence; it is a little too big
ptemp = New Image( imageDir || "PXL_20230505_103556694.png" );
{ncols, nrows} = ptemp << size;
ptemp << crop(
Left( Floor( (ncols - 1920) / 2 ) ),
Right( Floor( (ncols - 1920) / 2 ) + 1920 ),
top( Floor( (nrows - 1080) / 2 ) ),
bottom( Floor( (nrows - 1080) / 2 ) + 1080 )
);
{ncols, nrows} = ptemp << size;
If( Abs( N Cols( BigBitmap ) / ncols - N Rows( BigBitmap ) / nrows ) > 1e-14,
Throw( "xxx", traceback )
);
upscale = N Rows( BigBitmap ) / nrows;
ptemp << scale( FudgeScale * upscale );
{ncols, nrows} = ptemp << size;
If( ncols != N Cols( BigBitmap ) | nrows != N Rows( BigBitmap ),
Throw( Char( ncols ) || " " || Char( nrows ), traceback )
);
// small is initialized to the start/final image which was just upscaled
SmlBitmap[0, 0] = -(ptemp << getpixels); // the normally negative jmp colors are kept as positive in big and small
ptemp = 0;//release
Here's the workhorse function that finds similar bitmaps. Originally the "small" bitmap was much smaller than the "big" one, but now they are both the same big size to keep the pixels aligned. This function makes a new big bitmap whose 512x512 images are chosen to have a color similar to the average color of the same area in the small bitmap. Later, big and small will be blended and scaled to make a video frame.
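Inside the function below, the patch's average color is computed by peeling the packed 0xRRGGBB pixel integers apart with a Mod/subtract chain. A Python sketch of the same decode, using bit shifts instead (the helper name is mine):

```python
def mean_rgb(packed):
    """packed: iterable of 0xRRGGBB pixel ints; returns channel means in 0..255."""
    pixels = list(packed)
    n = len(pixels)
    red   = sum((p >> 16) & 0xFF for p in pixels) / n
    green = sum((p >> 8) & 0xFF for p in pixels) / n
    blue  = sum(p & 0xFF for p in pixels) / n
    return red, green, blue
```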
bigFromSml = Function( {fold}, // makes a new big from the 13K 512x512 choices that aligns with small
{irow, icol, patch, red, green, blue, rows, dists, path, img},
For( irow = 0, irow < N Rows( SmlBitmap ), irow += 512,
Show( irow );
Wait( 0 );
srcrow = Floor( irow ) + 1 :: Floor( irow ) + 512;
For( icol = 0, icol < N Cols( SmlBitmap ), icol += 512,
srccol = Floor( icol ) + 1 :: Floor( icol ) + 512;
patch = SmlBitmap[srcrow, srccol]; // carve out a patch of 512x512 pixels from small
blue = Mod( patch, 256 );
green = Mod( patch -= blue, 256 * 256 );
red = Mod( patch -= green, 256 * 256 * 256 );
// 0..255
blue = Mean( blue );
green = Mean( green ) / 256;
red = Mean( red ) / (256 * 256);
// lookup a bitmap with similar color
nchoices = 5; // pick one of these nearest choices
{rows, dists} = kdlookup << KNearestRows( nchoices, red / 255 || green / 255 || blue / 255 || fold );
If( Mean( dists ) > 1.1,
// if the lookup is returning far-away answers, recycle old pictures
Show( kdlookup << insertrows( 1 :: N Rows( dt ) ) )
);
choice = rows[Random Integer( 1, N Items( rows ) )]; // randomly pick one of the 5 best choices
dt:nUsed[choice] += 10 ^ (4 * fold);// debugging
kdlookup << removeRows( choice );// don't reuse this picture until recycled
path = imageDir || dt:filename[choice];
img = Try(
New Image( path )// during development I might delete a file while this is running
, //
Show( path, choice, exception_msg );
choice = 1;
kdlookup << insertrows( choice );
New Image( imageDir || dt:filename[choice] ); // see if row 1 can keep it going
);
{c, r} = img << size;
img << scale( FudgeScale * 512 / c, FudgeScale * 512 / r ); // a few images need fixups
BigBitmap[srcrow, srccol] = -(img << getpixels); // same location for small and big. store positive jmpcolor
);
)
);
fold = 0;
bigFromSml( fold ); // make the original big before the loop, using the small image
Before running the picture-generating loop, a couple more constants, one calculated by tweaking some numbers. The 12.8 took a while to understand: it makes the zoom-in speed appear constant. As the zoom in on the blended big and small bitmaps proceeds, fewer and fewer pixels have to be removed around the edges to get the same percent zoom factor. 12.8 is the ratio of the initial to final zoom factors.
outdir = "\\VBOXSVR\nfsc\deeppic\"; // the video uses a part1 and part2 sequence
// maxrows is the maximum rows that can be removed from the top and bottom without needing to stretch the middle to the VideoRows
maxrows = (N Rows( SmlBitmap ) - VideoRows) / 2;
countmaxrows = 0;
startrows = 48.705; // 48.705 hand picked for 360 frames...6 second transitions
endrows = startrows / (N Rows( SmlBitmap ) / VideoRows);// 12.8 is ratio of start image size to end image size
For( rowReduce = 1, rowReduce <= maxrows, rowReduce += Interpolate( rowReduce, 0, startrows, maxrows, endrows ),
countmaxrows += 1
);
Show( 361, countmaxrows, startrows, endrows );// targeting 180... no, let's use 360 frames at 60 FPS (181-1) (361-1)
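Why the interpolated step gives a constant apparent zoom: since endrows = startrows / (H / VideoRows), the linear interpolation makes the step exactly proportional to the rows still visible, so every frame shrinks the view by the same factor. A Python sketch using the same constants (variable names are mine):

```python
H, VIDEO_ROWS = 512 * 48 * 9 // 16, 1080       # 13824 rows in the big bitmaps
maxrows = (H - VIDEO_ROWS) / 2                 # 6372
startrows = 48.705
endrows = startrows / (H / VIDEO_ROWS)         # startrows / 12.8

def step(r):
    """Linear interpolation of the step size, like JSL's Interpolate()."""
    return startrows + (endrows - startrows) * r / maxrows

frames, r, ratios = 0, 1.0, []
while r <= maxrows:
    nxt = r + step(r)
    ratios.append((H - 2 * r) / (H - 2 * nxt))  # per-frame zoom factor
    r = nxt
    frames += 1
# frames comes out near the 360 the Show() above is targeting, and every
# entry in ratios is (numerically) the same per-frame zoom factor.
```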
Finally, the loop that emits the pictures. It is a nested loop; the outer loop calls the bigFromSml function to prepare the inner loop to make another 360 images. The inner loop grabs smaller and smaller patches of pixels from the big and small bitmaps, scales them, blends them, and writes the blended images to disk.
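The blend in the loop below is a straight linear cross-fade; in Python terms (a single-channel sketch, helper name mine):

```python
def cross_fade(big, sml, frame, total):
    """Frame 0 is all 'small'; by frame `total` the new 'big' has fully faded in."""
    alpha = 1 - frame / total            # weight of the previous (small) image
    return [b * (1 - alpha) + s * alpha for b, s in zip(big, sml)]
```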
// 1e12, etc. for naming the output files; this loop is not intended to finish, so interrupt it when there are enough frames
For( iround = 1e12, iround <= 9e12, iround += 1e6,
Show( iround );
Wait( 0.01 );
pictureNumber = 0;
For( rowReduce = 0, rowReduce <= maxrows, rowReduce += Interpolate( rowReduce, 0, startrows, maxrows, endrows ),
If( rowReduce == 0 & iround != 1e12,
Continue()
);// keep first, later it is an expensive duplicate
colReduce = 16 * rowReduce / 9;
// each image is a blend of small and big; the slice from small and big is combined
// and scaled to the video size. the slices get smaller and smaller and need to be scaled
// down less and less until the middle slice needs a scale of 1:1
start = HP Time();
smlPatch = SmlBitmap[Round( rowReduce ) + 1 :: N Rows( SmlBitmap ) - Round( rowReduce ), Round( colReduce ) + 1 :: N Cols( SmlBitmap ) - Round( colReduce )];
smlImg = New Image( -smlPatch );
{ncols, nrows} = smlImg << size;
smlImg << scale( FudgeScale * VideoCols / ncols, FudgeScale * VideoRows / nrows );
If( smlImg << size != VideoSize,
Throw( Char( smlImg << size ), traceback )
);
{smlred, smlgreen, smlblue} = smlImg << getpixels( "rgb" );
bigPatch = BigBitmap[Round( rowReduce ) + 1 :: N Rows( SmlBitmap ) - Round( rowReduce ), Round( colReduce ) + 1 :: N Cols( SmlBitmap ) - Round( colReduce )];
bigImg = New Image( -bigPatch );
{ncols, nrows} = bigImg << size;
bigImg << scale( FudgeScale * VideoCols / ncols, FudgeScale * VideoRows / nrows );
If( bigImg << size != VideoSize,
Throw( Char( bigImg << size ), traceback )
);
{bigred, biggreen, bigblue} = bigImg << getpixels( "rgb" );
stop = HP Time();
// blend and save the scaled slices
alpha = 1 - (pictureNumber / countmaxrows);
beta = 1 - alpha;
combine = New Image( "rgb", {bigred * beta + smlred * alpha, biggreen * beta + smlgreen * alpha, bigblue * beta + smlblue * alpha} );
combine << savepicture( outdir || Char( iround + (pictureNumber += 1) ) || ".jpg", "jpg" );
Write( "\!n" || Char( (stop - start) / 1e6, 6 ) || " " || Char( iround - 1e12 ) || " " || Char( pictureNumber ) );
Wait( 0.01 );
);
// combine becomes the new smlBitmap
combine << scale( FudgeScale * upscale );
{ncols, nrows} = combine << size;
If( ncols != N Cols( BigBitmap ) | nrows != N Rows( BigBitmap ),
Throw( Char( ncols ) || " " || Char( nrows ), traceback )
);
SmlBitmap[0, 0] = -(combine << getpixels);
combine = 0;//release
// and make a new big bitmap from small
fold = Mod( fold + 1, nfolds );
bigFromSml( fold );
);
Once the pictures are created, the Blender non-linear editor can load them and make a video. I went with 60FPS rather than more spatial resolution because the "big" bitmaps above are already on the edge of too big.
Blender is open source; I use it on Linux, and Windows and Mac versions are available too. It has a long learning curve.
All done. I like the transitions; between choosing the colors and the blending, it works well.
If you are back in the office, you might want to turn the volume down a little before you start.
@hogi Glad you watched far enough to see the reverse! The first 3/5 is actually played backwards from the way it was generated by the JSL. The first part ends at the starting frame. The second 2/5 is played forwards so it begins at the same image the first part ends on. The two sequences are different because of the randomly chosen best of 5 choices...they were separately generated.