The Multiple File Import (MFI) tool in JMP can make a data table of the names, sizes, and dates of the files it finds. The JSL in the attached add-in (JMP 16, Windows only) takes that data table and creates a nested tree view of the disk space used by the files. It is medium speed; it loaded 500K files from the root of the C drive in roughly 30 minutes (I got bored and watched the news, so I'm not sure exactly). There is a viewer for a few file types. The JSL is shown below.
Not starting at the root is faster.
The JSL starts with a prompt for a directory. The prompt always seems clumsy to use because a double-click doesn't select a directory; it opens it. Instead, highlight a directory and pick OK. The directory is given to MFI and the file list is retrieved. A couple of checks for Cancel are made along the way.
root = Pick Directory( "show directory sizes", "$Desktop/.." );
If( root == "", Throw( "canceled" ));
// use MFI to get a recursive file list with sizes and dates
mfi = Multiple File Import(
<<Set Folder( root ),
<<Set Show Hidden( 0 ),
<<Set Subfolders( 1 ),
<<Set Name Filter( "*.*;" ),
<<Set Name Enable( 0 ),
<<Set Size Filter( {-1, 5936772669} ),
<<Set Size Enable( 0 ),
<<Set Date Filter( {0, 3725849188.969} ),
<<Set Date Enable( 0 )
);
boxtop = mfi << createwindow;
// capture the displayed file names into a data table
dtFiles = boxtop[Table Box( 1 )] << MakeIntoDataTable( invisible( 1 ) );
boxtop << closewindow;
nRowsDtFiles = N Rows( dtFiles );
If( nRowsDtFiles == 1 & Is Missing( dtFiles:FileSize[1] ),
Throw( "canceled?" )
); // this catches the 1st cancel dialog, but not the 2nd
Next comes some modified code from Progress Bar with Cancel Button. The progress bar window is recycled at the end to hold the tree.
// Progressbar https://community.jmp.com/t5/Uncharted/Progress-Bar-with-Cancel-Button/ba-p/433560
progressBarWidth = 500;
cancel = 0; // clear the cancel flag
New Window( "Directory Tree", // this window gets reused below
windowbox = V List Box( //
t = Text Box( "", <<setwrap( 500 ) ), // status of the progressbar
H List Box( // Duct tape has a light side and a dark side. So does the progressbar.
left = Spacer Box( size( 0, 10 ), color( "light green" ) ), //
right = Spacer Box( size( progressBarWidth, 10 ), color( "dark green" ) ), //
<<padding( 5, 5, 5, 5 ), // gray wrapper
<<backgroundcolor( "dark gray" ) //
), //
cancelButton = Button Box( "Cancel", cancel = 1 ) // sets the cancel flag
)
);
Wait( 0 ); // allow the window to open, otherwise the updates are not visible
// use this to update the first half of the progress bar
updateProgress = Function( {fractionComplete},
leftsize = Round( progressBarWidth * fractionComplete * .5 ); // first half of progress bar
rightsize = progressBarWidth - leftsize;
left << width( leftsize );
right << width( rightsize );
t << settext( Eval Insert( "organizing data ^irow^ / ^nRowsDtFiles^" ) );
t << updatewindow; // this works without the wait
);
And it is time to jump in. This code makes a pass over the data table, building a tree that mirrors the shape of the folder tree on disk. When done, the data table is closed because all the data lives in the tree that is in the associative array. The links that connect the tree together are actually keys in the associative array.
// look at every row in the data table and build a tree in an associative array. The keys in the
// array represent directories, NOT files. Files are held in a {list} at each directory level.
// file sizes are accumulated at each of the file's ancestor directory levels, so there is
// a grand total at the root.
init = {{/*immediate files and dirs*/}, 0/*aggregate file count*/, 0.0/*aggregate bytes*/}; // all dir nodes look like this; there is no date for dirs
directory = [=> ]; // associative array to hold the nodes
directory[root] = init; // the root node is handled separately
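nBytes = 3; // item 3 is the byte count, in both directory nodes and file entries
nDate = nCount = 2; // item 2 is the aggregate file count in a directory node, but the date in a file entry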
For( irow = 1, irow <= nRowsDtFiles & !cancel, irow += 1,
levels = Words( dtFiles:FileName[irow], "/" ); // break out the directory levels for this row's file
path = root; // the root is not part of the file's path broken out in levels[] but is needed in the real path spec
size = dtFiles:FileSize[irow];
date = dtFiles:FileDate[irow];
// root handled separately...
directory[root][nCount] += 1; // item 2 in the init list is a count of files here AND below
directory[root][nBytes] += size; // item nBytes is the size of all files here AND below
For( ilevel = 1, ilevel < N Items( levels ), ilevel += 1,
parent = path;
path = path || levels[ilevel] || "\";
If( !Contains( directory, path ), // never seen this node before? create it!
directory[path] = init; // create with an empty list
// add this directory to the parent directory, at position 1 so dirs come first
// Important: the missing value in the 2-item list signals a directory, not date or size.
Insert Into( directory[parent][1], Eval List( {Eval List( {levels[iLevel], .} )} ), 1 ); // my parent dir points to me (I'm also a dir)
);
directory[path][nCount] += 1; // count this file at every ancestor directory (files here AND below)
directory[path][nBytes] += size;
If( Mod( irow, 1000 ) == 0,
updateProgress( irow / nRowsDtFiles );
Wait( 0 );
);
);
// every row represents a file. This is inserted at the end so files come last, but that is no
// longer important since they are sorted together by descending aggregate size
// Important: the 2nd item (date) in this list is not missing, so it is a file
If( Is Missing( date ),
Throw( "missing date?" )
);
Insert Into( directory[path][1], Eval List( {Eval List( {levels[iLevel], date, size} )} ) ); // a file entry is a 3-item list {name, date, size}; a dir entry has only 2 items
);
Close( dtFiles, nosave ); // done with the dt, the data is all loaded in the associative array
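To make the shape of the tree concrete, here is a minimal sketch of the same accumulation on two made-up files. The root "C:\demo\" and the file names are hypothetical, the index constants are written out as 2 and 3, and the progress bar and cancel handling are left out.

```jsl
// Sketch with made-up data; "C:\demo\" and the file names are hypothetical.
root = "C:\demo\";
names = {"a/b/big.bin", "a/small.txt"}; // relative names, split on "/" as in the loop above
sizes = {1000, 10};
directory = [=> ];
directory[root] = {{}, 0, 0}; // {children, aggregate file count, aggregate bytes}
For( i = 1, i <= N Items( names ), i++,
	levels = Words( names[i], "/" );
	path = root;
	directory[root][2] += 1;
	directory[root][3] += sizes[i];
	For( k = 1, k < N Items( levels ), k++, // every level except the last is a directory
		parent = path;
		path = path || levels[k] || "\";
		If( !Contains( directory, path ),
			directory[path] = {{}, 0, 0};
			// a 2-item entry {name, .} in the parent marks a subdirectory
			Insert Into( directory[parent][1], Eval List( {Eval List( {levels[k], .} )} ), 1 );
		);
		directory[path][2] += 1;
		directory[path][3] += sizes[i];
	);
	// after the loop k indexes the last level, the file itself: {name, date, size}
	Insert Into( directory[path][1], Eval List( {Eval List( {levels[k], Today(), sizes[i]} )} ) );
);
Show( directory[root][3], directory["C:\demo\a\b\"][3] );
```

With this input the root should total 1010 bytes and "C:\demo\a\b\" should total 1000 bytes, because each file's size is added at every ancestor directory.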
Check for cancel, create the pretty icons, and set up the second half of the progress bar. Yes, making your own custom icons can be as simple as that! Each icon varies two things at once, how far the red extends from left to right across the gray and how dark the red is, so the icons are a little more expressive about the huge range of file sizes.
If( cancel, // test the cancel flag
Beep(); // audio flag
left << color( "dark red" ); // visual flag something is awry
right << color( "red" );
cancelButton << setbuttonname( "Close" ); // visual flag the button function is changed
cancelButton << setscript( (cancelButton << closewindow) ); // and what to do if the button is pressed
Stop();
, //
updateProgress( 1.0 ); // people like to see the bar go to 100%
);
// make icons
redcolor = {0.627450980392157, 0.0352941176470588, 0.133333333333333};
graycolor = {0.831372549019608, 0.831372549019608, 0.831372549019608};
ncons = 64; // icon width and height in pixels, and the index of the last icon (0..ncons). The displayed icon may be smaller; the shades will help
For( i = 0, i <= ncons, i += 1,
// the width of the red and the shade of the red are both diminishing because
// there is a huge range to cover and this helps express small vs large better
H List Box(
Spacer Box( size( i, ncons ), color( RGB Color( redcolor * (i / ncons) + graycolor * (1 - i / ncons) ) ) ), // left side
Spacer Box( size( ncons - i, ncons ), color( RGB Color( graycolor ) ) ) // right side
) << savepicture( "$temp/deletemeIcon" || Char( i ) || ".png" )
);
biggestsize = Log( directory[root][nBytes] + 1 );// biggest in this tree is dark red square; near 0 is all light gray
icon = Function( {size}, // get the right icon for this file size
size = Floor( ncons * Log( size + 1 ) / biggestsize );
"$temp/deletemeIcon" || Char( size ) || ".png";
);
// update progressbar for 2nd phase this way...
filesDone = 0; // files added to the tree so far; explore() increments this, so it must start at 0
updateProgress2 = Function( {fractionComplete},
leftsize = Round( .5 * progressBarWidth + progressBarWidth * fractionComplete * .5 ); // second half of progress bar
rightsize = progressBarWidth - leftsize;
left << width( leftsize );
right << width( rightsize );
t << settext( Eval Insert( "finishing up display construction ^filesDone^ / ^nRowsDtFiles^" ) );
t << updatewindow; // this works without the wait
);
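As a rough illustration of how icon() spreads sizes over the 65 icons, this sketch applies the same log scaling to a few made-up sizes. The 1 GB tree total is an assumption for the example only.

```jsl
// Sketch only: the same log scaling as icon(), applied to made-up sizes.
ncons = 64;
biggest = Log( 1e9 + 1 ); // assume the whole tree holds 1 GB
For Each( {bytes}, {0, 1000, 1e6, 1e9},
	idx = Floor( ncons * Log( bytes + 1 ) / biggest );
	Write( Char( bytes ) || " bytes -> icon " || Char( idx ) || "\!n" );
);
```

Under that assumption an empty file maps to icon 0, a 1 KB file to icon 21, a 1 MB file to icon 42, and the full 1 GB to icon 64, so each factor of 1000 moves about a third of the way along the scale.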
The explore function is recursive; it explores the tree in the associative array and builds a sorted tree node control.
// recursive function to explore the tree loaded in the associative array and build the treenode structure
explore = Function( {path, attachpoint},
{treeNode, list, childpath, nfiles, bytes, appendTnode, appendSizes, m, newbutton, i, j},
list = directory[path][1]; // all the files and subdirs of this node
// keep two parallel lists of TreeNodes and the size of the object represented
// later, use rank() to sort them so the big objects are at the top of the display
appendTnode = {};
appendSizes = {};
For Each( {namedata}, list, // subdir or file? subdir will recurse, file is a leaf.
If( Is Missing( namedata[nDate] ), // then this is a dir {name, . }
If( N Items( namedata ) != 2,
Throw( "dir? " || Char( namedata ) ) // unexpected bug
);
childpath = path || namedata[1] || "\";
nfiles = directory[childpath][nCount];
bytes = directory[childpath][nBytes];
treeNode = Tree Node(
Eval Insert(
"\[^Format(bytes, "Fixed Dec", Use thousands separator( 1 ), 30, 0 )^ bytes in directory with ^nfiles^ items ^namedata[1]^]\"
)
);
treeNode << seticon( icon( bytes ) );
treeNode << setdata( childpath );
explore( childpath, treeNode ); // recursion! build the subtree
Insert Into( appendTnode, treeNode );
Insert Into( appendSizes, bytes );
, // else a file belonging to path: {name, date, bytes}
treeNode = Tree Node(
Eval Insert(
"\[^Format(namedata[nBytes], "Fixed Dec", Use thousands separator( 1 ), 30, 0 )^ bytes updated ^round((today()-namedata[nDate])/inweeks(1))^ weeks ago ^namedata[1]^]\"
)
);
treeNode << seticon( icon( namedata[nBytes] ) );
childpath = path || namedata[1];
treeNode << setdata( childpath );
Insert Into( appendTnode, treeNode );
Insert Into( appendSizes, namedata[nBytes] );
filesDone += 1;
If( Mod( filesDone, 100 ) == 0,
Try( updateProgress2( filesDone / nRowsDtFiles ) );
Wait( 0 );
If( cancel,
Beep(); // audio flag
left << color( "dark red" ); // visual flag something is awry
right << color( "red" );
cancelButton << setbuttonname( "Close" ); // visual flag the button function is changed
cancelButton << setscript( (cancelButton << closewindow) ); // and what to do if the button is pressed
Throw( "canceled" );
);
);
)
);
m = Rank( appendsizes );
For( j = N Items( m ), j >= 1, j--, // Rank() orders small to large, so walk it backward to attach the biggest first
i = m[j];
attachpoint << append( appendTnode[i] );
);
);
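The sort at the end of explore() can be tried on its own. This sketch uses two made-up parallel lists in place of the tree nodes and their sizes.

```jsl
// Sketch with made-up data: walk Rank() backward to visit the largest item first.
names = {"small.txt", "huge.bin", "medium.jmp"};
sizes = {10, 5000, 300};
order = Rank( sizes ); // positions that would sort sizes from small to large
For( j = N Items( sizes ), j >= 1, j--,
	Write( names[order[j]] || " " || Char( sizes[order[j]] ) || "\!n" );
);
```

This should list huge.bin first, then medium.jmp, then small.txt, which is the order in which the real code appends tree nodes.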
The TreeClickHandler opens a simple viewer for a few file types.
// click handler for tree nodes
treeClickHandler = Function( {thistree, thisnode}, // a few viewers for common file types
{extension, data, tablename, nodedata, nodelabel},
nodelabel = thisnode << getlabel;
nodedata = thisnode << getdata;
// show(nodelabel,nodedata);
line1 << settext( nodelabel );
line2 << setbuttonname( nodedata );
// show(line2<<getbuttonname);
Try( (view << child) << delete, Show( exception_msg ) );
extension = Regex( nodedata, ".*?([^\.]*)$", "\1" );
If(
Contains( {"png", "jpg", "jpeg", "gif", "bmp"}, Lowercase( extension ) ), //
view << append( New Image( nodedata ) );//
, /*else if*/
Contains( {"txt", "jsl", "csv"}, Lowercase( extension ) ), //
data = Load Text File( nodedata, blob( readlength( 2000 ) ) );
If( Length( data ) < 2000, // arbitrary limit, here, and above
data = Blob To Char( data )
,
data = Blob To Char( data ) || "\!n\!n✁✂✃✄\!n\!ntruncated\!n\!n✁✂✃✄\!n"
);
view << append( Text Box( (data), <<setwrap( 1000 ) ) );//
, /*else if*/
Contains( {"jrn"}, Lowercase( extension ) ), //
view << append( Journal Box( Load Text File( nodedata ) ) );//
, /*else if*/
Contains( {"jmp"}, Lowercase( extension ) ), //
tablename = nodedata;
If( File Size( tablename ) < 1e6, // arbitrary limit to prevent slow down, here and below
dt = Open( tablename, invisible );
view << append( Journal Box( Data Table Box( dt ) << get journal ) );
Close( dt, nosave );//
, // else
view << append( Text Box( "table>1MB, too big for this viewer, click link above to open it" ) );//
);//
, /*else if*/
Ends With( extension, "\" ), //
view << append( Text Box( "This is a directory. Click the link above to open it." ) );//
, // else
view << append( Text Box( "Not sure what this is, don't click the link unless you want JMP to open() it." ) );//
);
);
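The viewer choice depends on what the extension regex returns, so here is a small sketch that runs the same regex over three made-up paths. The paths and the kind labels are hypothetical; the real handler appends display boxes instead of printing.

```jsl
// Sketch: the extension regex from the click handler, tried on made-up paths.
paths = {"C:\demo\photo.PNG", "C:\demo\notes.txt", "C:\demo\sub.dir\"};
For Each( {p}, paths,
	ext = Regex( p, ".*?([^\.]*)$", "\1" ); // everything after the last "."
	kind = If(
		Contains( {"png", "jpg", "jpeg", "gif", "bmp"}, Lowercase( ext ) ), "image",
		Contains( {"txt", "jsl", "csv"}, Lowercase( ext ) ), "text",
		Ends With( ext, "\" ), "directory",
		"unknown"
	);
	Write( p || " -> " || kind || "\!n" );
);
```

A directory path ends in a backslash, so whatever the regex captures for it also ends in a backslash. That is why the Ends With() test catches directories, even one with a dot in its name.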
Finally, make the root node, pass it to explore(), and make a window with a Tree Box to show the tree node control. There is a button that will open() files; be careful pressing it. A short list of executable file extensions (not shown, and certainly not comprehensive) tries to keep you out of trouble...
rootTreeNode = Tree Node( Eval Insert( "\[^Format(directory[root][nBytes], "Fixed Dec", Use thousands separator( 1 ), 30, 0 )^ ^root^]\" ) );
rootTreeNode << seticon( icon( directory[root][nBytes] ) );
explore( root, rootTreeNode );
windowbox = windowbox << parent;
(windowbox << child) << delete;
windowbox << append(
H Splitter Box(
size( 1000, 600 ),
treebox = Tree Box( rootTreeNode ),
vlistbox = V List Box(
line1 = Text Box(), // also updated by treeClickHandler
line2 = Button Box( "", // gets link-style, below. Visible name updated by treeClickHandler script.
setfunction( // when the button clicks, use its current visible name to open() the file.
Function( {this},
name = this << getbuttonname;// the underlined button name *is* the file name
extension = Uppercase( Regex( name, ".*?([^\.]*)$", "\1" ) );
If( !Contains( executable, extension ), // seems a bad idea
Open( name, Add to Recent Files( 0 ) )
,
Beep()
);
)
)
),
scrollbox = Scroll Box( view = Border Box() )
)
)
);
windowbox << Set Window Title( "Directory Tree " || root );
line2 << UnderlineStyle( 1 );
line1 << Set Stretch( "fill", "off" );
line2 << Set Stretch( "fill", "off" );
scrollbox << Set Stretch( "off", "off" );
treebox << Set Auto Stretching( 1, 1 ) << Set Max Size( 10000, 10000 ) << User Resizable( {0, 0} );
scrollbox << Set Auto Stretching( 1, 1 ) << Set Max Size( 10000, 10000 ) << User Resizable( {0, 0} );
treebox << Set Node Select Script( treeClickHandler );
treebox << expand( rootTreeNode );
(Screenshots: the viewer showing a picture, a table, and a journal.)
If you install the add-in, you can peek at the JSL using View > Add-Ins. You can also download it and rename it with a .zip extension to look inside.
(The first pass of this JSL used outline nodes, which allow for a lot more bells and whistles than tree nodes. But outline nodes are not well suited to displaying 100K+ nested nodes; tree nodes are designed for that job. If you spot some leftover outline references (I think I cleaned them up...), that's what happened.)