Craige_Hales
Super User
Directory Tree: Explore Space Used by Folders

The Multiple File Import (MFI) tool in JMP can make a data table of the names, sizes, and dates of the files it finds. The JSL in the attached add-in (JMP 16, Windows only) takes that data table and creates a nested tree view of the disk space used by the files. Speed is moderate: it loaded 500K files from the root of the C drive in roughly 30 minutes (I got bored and watched the news, so the timing is not exact). There is a viewer for a few file types; the JSL is shown below.

Not starting at the root is faster.

The JSL starts with a prompt for a directory. The prompt always seems clumsy to use because a double click opens a directory rather than selecting it. Highlight a directory and pick OK. The directory is given to MFI and the file list is retrieved. A couple of checks for Cancel are made along the way.

root = Pick Directory( "show directory sizes", "$Desktop/.." );
If( root == "", Throw( "canceled" ));


// use MFI to get a recursive file list with sizes and dates
mfi = Multiple File Import(
    <<Set Folder( root ),
    <<Set Show Hidden( 0 ),
    <<Set Subfolders( 1 ),
    <<Set Name Filter( "*.*;" ),
    <<Set Name Enable( 0 ),
    <<Set Size Filter( {-1, 5936772669} ),
    <<Set Size Enable( 0 ),
    <<Set Date Filter( {0, 3725849188.969} ),
    <<Set Date Enable( 0 )
);
boxtop = mfi << createwindow;
// capture the displayed file names into a data table
dtFiles = boxtop[Table Box( 1 )] << MakeIntoDataTable( invisible( 1 ) );
boxtop << closewindow;
nRowsDtFiles = N Rows( dtFiles );
If( nRowsDtFiles == 1 & Is Missing( dtFiles:FileSize[1] ),
    Throw( "canceled?" )
); // this catches the 1st cancel dialog, but not the 2nd

Next is some modified code from Progress Bar with Cancel Button. The progress bar window is recycled at the end to hold the finished display.

// Progressbar    https://community.jmp.com/t5/Uncharted/Progress-Bar-with-Cancel-Button/ba-p/433560
progressBarWidth = 500;
cancel = 0; // clear the cancel flag
New Window( "Directory Tree", // this window gets reused below
    windowbox = V List Box( //
        t = Text Box( "", <<setwrap( 500 ) ), // status of the progressbar
        H List Box( // Duct tape has a light side and a dark side. So does the progressbar.
            left = Spacer Box( size( 0, 10 ), color( "light green" ) ), //
            right = Spacer Box( size( progressBarWidth, 10 ), color( "dark green" ) ), //
            <<padding( 5, 5, 5, 5 ), // gray wrapper
            <<backgroundcolor( "dark gray" ) //
        ), //
        cancelButton = Button Box( "Cancel", cancel = 1 ) // sets the cancel flag
    )
);
Wait( 0 ); // allow the window to open, otherwise the updates are not visible
// use this to update the first half of the progress bar
updateProgress = Function( {fractionComplete},
    leftsize = Round( progressBarWidth * fractionComplete * .5 ); // first half of progress bar
    rightsize = progressBarWidth - leftsize;
    left << width( leftsize );
    right << width( rightsize );
    t << settext( Eval Insert( "organizing data ^irow^ / ^nRowsDtFiles^" ) );
    t << updatewindow; // this works without the wait	
);

And it is time to jump in. This code makes a pass over the data table, building a tree that mirrors the shape of the folder tree on disk. When done, the data table is closed because all the data lives in the tree that is in the associative array. The links that connect the tree together are actually keys in the associative array.


// look at every row in the data table and build a tree in an associative array. The keys in the
// array represent directories, NOT files. Files are held in a {list} at each directory level.
// file sizes are accumulated at each of the file's ancestor directory levels, so there is
// a grand total at the root.
init = {{/*immediate files and dirs*/}, 0/*aggregate file count*/, 0.0/*aggregate bytes*/}; // all dir nodes look like this; there is no date for dirs
directory = [=> ]; // associative array to hold the nodes
directory[root] = init; // the root node is handled separately
nBytes = 3;
nDate = nCount = 2;
For( irow = 1, irow <= nRowsDtFiles & !cancel, irow += 1,
    levels = Words( dtFiles:FileName[irow], "/" ); // break out the directory levels for this row's file
    path = root; // the root is not part of the file's path broken out in levels[] but is needed in the real path spec
    size = dtFiles:FileSize[irow];
    date = dtFiles:FileDate[irow];
    // root handled separately...
    directory[root][nCount] += 1; // item 2 in the init list is a count of files here AND below
    directory[root][nBytes] += size; // item nBytes is the size of all files here AND below
    For( ilevel = 1, ilevel < N Items( levels ), ilevel += 1,
        parent = path;
        path = path || levels[ilevel] || "\";
        If( !Contains( directory, path ), // never seen this node before? create it!
            directory[path] = init; // create with an empty list
            // add this directory to the parent directory, at position 1 so dirs come first
            // Important: the missing value in the 2-item list signals a directory, not date or size.
            Insert Into( directory[parent][1], Eval List( {Eval List( {levels[iLevel], .} )} ), 1 ); // my parent dir points to me (I'm also a dir)
        );
        directory[path][nCount] += 1; // count of files here AND below (one increment per file row)
        directory[path][nBytes] += size;
        
        If( Mod( irow, 1000 ) == 0,
            updateProgress( irow / nRowsDtFiles );
            Wait( 0 );
        );
    );
    // every row represents a file. This is inserted at the end so files come last, but that is no 
    // longer important since they are sorted together by descending aggregate size
    // Important: the 2nd item (date) in this list is not missing, so it is a file
    If( Is Missing( date ),
        Throw( "missing date?" )
    );
    Insert Into( directory[path][1], Eval List( {Eval List( {levels[iLevel], date, size} )} ) ); // also a 3-item list, not 2 for a dir
);

Close( dtFiles, nosave ); // done with the dt, the data is all loaded in the associative array
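
To make the node layout concrete, here is a toy sketch of the same pass run on two made-up files instead of the MFI data table. The root path, file names, sizes, and dates are assumptions for illustration; only the node shape and the roll-up logic come from the code above.

```jsl
// Toy version of the tree-building pass. The two files below are made up;
// the real script reads names, sizes, and dates from dtFiles.
root = "C:\demo\";
names = {"a.txt", "sub/b.txt"};
sizes = {100, 250};
dates = {Today(), Today()};
init = {{}, 0, 0.0}; // {children, aggregate file count, aggregate bytes}
nBytes = 3;
nCount = 2;
directory = [=> ];
directory[root] = init;
For( irow = 1, irow <= N Items( names ), irow += 1,
    levels = Words( names[irow], "/" );
    path = root;
    directory[root][nCount] += 1;
    directory[root][nBytes] += sizes[irow];
    For( ilevel = 1, ilevel < N Items( levels ), ilevel += 1,
        parent = path;
        path = path || levels[ilevel] || "\";
        If( !Contains( directory, path ),
            directory[path] = init;
            // a directory entry is a 2-item list with a missing value
            Insert Into( directory[parent][1], Eval List( {Eval List( {levels[ilevel], .} )} ), 1 );
        );
        directory[path][nCount] += 1;
        directory[path][nBytes] += sizes[irow];
    );
    // a file entry is a 3-item list {name, date, bytes}
    Insert Into( directory[path][1], Eval List( {Eval List( {levels[ilevel], dates[irow], sizes[irow]} )} ) );
);
Show( directory << Get Keys ); // two keys: the root, and the root plus "sub\"
Show( directory[root][nBytes] ); // 350, both files roll up to the root
Show( directory[root || "sub\"][nBytes] ); // 250, only b.txt lives under sub
```

The root node ends up with two children: the directory entry for sub (inserted at position 1) followed by the file entry for a.txt.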

Check for cancel, create the pretty icons, set up the second half of the progress bar. Yes, making your own custom icons can be as simple as that! By changing both the left-to-right red-to-gray and the shade of red, the icons are a little more expressive about the huge range of file sizes.


If( cancel, // test the cancel flag
    Beep(); // audio flag
    left << color( "dark red" ); // visual flag something is awry
    right << color( "red" );
    cancelButton << setbuttonname( "Close" ); // visual flag the button function is changed
    cancelButton << setscript( (cancelButton << closewindow) ); // and what to do if the button is pressed
    Stop();
, //
    updateProgress( 1.0 ); // people like to see the bar go to 100%
);

// make icons
redcolor = {0.627450980392157, 0.0352941176470588, 0.133333333333333};
graycolor = {0.831372549019608, 0.831372549019608, 0.831372549019608};
ncons = 64;// width, height, and n.  the actual displayed icon may be smaller, the shades will help
For( i = 0, i <= ncons, i += 1, 
    // the width of the red and the shade of the red are both diminishing because 
    // there is a huge range to cover and this helps express small vs large better
    H List Box(
        Spacer Box( size( i, ncons ), color( RGB Color( redcolor * (i / ncons) + graycolor * (1 - i / ncons) ) ) ), // left side
        Spacer Box( size( ncons - i, ncons ), color( RGB Color( graycolor ) ) ) // right side
    ) << savepicture( "$temp/deletemeIcon" || Char( i ) || ".png" )
);
biggestsize = Log( directory[root][nBytes] + 1 );// biggest in this tree is dark red square; near 0 is all light gray
icon = Function( {size}, // get the right icon for this file size
    size = Floor( ncons * Log( size + 1 ) / biggestsize );
    "$temp/deletemeIcon" || Char( size ) || ".png";
);


// update progressbar for 2nd phase this way...
updateProgress2 = Function( {fractionComplete},
    leftsize = Round( .5 * progressBarWidth + progressBarWidth * fractionComplete * .5 ); // second half of progress bar
    rightsize = progressBarWidth - leftsize;
    left << width( leftsize );
    right << width( rightsize );
    t << settext( Eval Insert( "finishing up display construction ^filesDone^ / ^nRowsDtFiles^" ) );
    t << updatewindow; // this works without the wait	
);
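
A quick worked example of the log scaling inside icon() may help show why it copes with the huge range of sizes. The 1e9-byte total and the sample sizes are assumed values, not measurements; the formula is the one from icon() above.

```jsl
// Worked example of the log scaling used by icon(). The tree total and the
// sample sizes are assumed values for illustration only.
ncons = 64;
biggestsize = Log( 1e9 + 1 ); // pretend the whole tree holds 1e9 bytes
indexes = {};
For Each( {size}, {0, 1e3, 1e6, 1e9},
    index = Floor( ncons * Log( size + 1 ) / biggestsize );
    Insert Into( indexes, index );
    Write( Char( size ) || " bytes -> icon " || Char( index ) || "\!N" );
);
```

With these numbers the icon indexes come out as 0, 21, 42, and 64: each factor of 1000 in file size moves about a third of the way across the icon range, so a 1 KB file and a 1 GB file are both distinguishable from an empty one.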

The explore function is recursive; it explores the tree in the associative array and builds a sorted tree node control.

    // recursive function to explore the tree loaded in the associative array and build the treenode structure
    explore = Function( {path, attachpoint},
        {treeNode, list, childpath, nfiles, bytes, appendTnode, appendSizes, m, newbutton},
        list = directory[path][1]; // all the files and subdirs of this node
        // keep two parallel lists of TreeNodes and the size of the object represented
        // later, use rank() to sort them so the big objects are at the top of the display
        appendTnode = {};
        appendSizes = {};
        For Each( {namedata}, list, // subdir or file? subdir will recurse, file is a leaf.
            If( Is Missing( namedata[nDate] ), // then this is a dir {name, . }
                If( N Items( namedata ) != 2,
                    Throw( "dir? " || Char( namedata ) ) // unexpected bug
                );
                childpath = path || namedata[1] || "\";
                nfiles = directory[childpath][nCount];
                bytes = directory[childpath][nBytes];
                treeNode = Tree Node(
                    Eval Insert(
                        "\[^Format(bytes, "Fixed Dec", Use thousands separator( 1 ), 30, 0 )^ bytes in directory with ^nfiles^ items ^namedata[1]^]\"
                    )
                );
                treeNode << seticon( icon( bytes ) );
                treeNode << setdata( childpath );
                explore( childpath, treeNode ); // recursion! build the subtree
                Insert Into( appendTnode, treeNode );
                Insert Into( appendSizes, bytes );
            , // else a file belonging to path {name, date, bytes}
                treeNode = Tree Node(
                    Eval Insert(
                        "\[^Format(namedata[nBytes], "Fixed Dec", Use thousands separator( 1 ), 30, 0 )^ bytes   updated ^round((today()-namedata[nDate])/inweeks(1))^ weeks ago   ^namedata[1]^]\"
                    )
                );
                treeNode << seticon( icon( namedata[nBytes] ) );
                childpath = path || namedata[1];
                treeNode << setdata( childpath );
                Insert Into( appendTnode, treeNode );
                Insert Into( appendSizes, namedata[nBytes] );
                filesDone += 1;
                If( Mod( filesDone, 100 ) == 0,
                    Try( updateProgress2( filesDone / nRowsDtFiles ) );
                    Wait( 0 );
                    If( cancel,
                        Beep(); // audio flag
                        left << color( "dark red" ); // visual flag something is awry
                        right << color( "red" );
                        cancelButton << setbuttonname( "Close" ); // visual flag the button function is changed
                        cancelButton << setscript( (cancelButton << closewindow) ); // and what to do if the button is pressed
                        Throw( "canceled" );
                    );
                );
            )
        );
        m = Rank( appendsizes );
        For( j = N Items( m ), j >= 1, j--, // biggest first, rank from small to large
            i = m[j];
            attachpoint << append( appendTnode[i] );
        );
    );
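
The parallel-list sort at the end of explore() is easy to try in isolation. This is a minimal sketch with made-up labels and sizes standing in for the tree nodes; it uses Rank() and the same backwards loop as the code above.

```jsl
// Sorting two parallel lists largest-first with Rank(), as explore() does.
// The labels and sizes are made-up sample values.
labels = {"small.txt", "big.iso", "medium.zip"};
sizes = {10, 3000, 200};
m = Rank( sizes ); // positions of the sizes, from smallest to largest
ordered = {};
For( j = N Items( m ), j >= 1, j--, // walk the ranks backwards: biggest first
    Insert Into( ordered, labels[m[j]] )
);
Show( ordered ); // big.iso first, small.txt last
```

Keeping the sizes in their own list means the nodes themselves never need to be compared; only the numeric list is ranked.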

The TreeClickHandler opens a simple viewer for a few file types.

    // click handler for tree nodes
    treeClickHandler = Function( {thistree, thisnode}, // a few viewers for common file types
        {extension, data, tablename, nodedata, nodelabel},
        nodelabel = thisnode << getlabel;
        nodedata = thisnode << getdata;
       // show(nodelabel,nodedata);
        line1 << settext( nodelabel );
        line2 << setbuttonname( nodedata );
       // show(line2<<getbuttonname);
        Try( (view << child) << delete, Show( exception_msg ) );
        extension = Regex( nodedata, ".*?([^\.]*)$", "\1" );
        If(
            Contains( {"png", "jpg", "jpeg", "gif", "bmp"}, Lowercase( extension ) ),  //
                view << append( New Image( nodedata ) );//
        , /*else if*/
            Contains( {"txt", "jsl", "csv"}, Lowercase( extension ) ), //
                data = Load Text File( nodedata, blob( readlength( 2000 ) ) );
                If( Length( data ) < 2000, // arbitrary limit, here, and above
                    data = Blob To Char( data )
                ,
                    data = Blob To Char( data ) || "\!n\!n✁✂✃✄\!n\!ntruncated\!n\!n✁✂✃✄\!n"
                );
                view << append( Text Box( (data), <<setwrap( 1000 ) ) );//
        , /*else if*/
            Contains( {"jrn"}, Lowercase( extension ) ),  //
                view << append( Journal Box( Load Text File( nodedata ) ) );//
        , /*else if*/
            Contains( {"jmp"}, Lowercase( extension ) ),  //
                tablename = nodedata;
                If( File Size( tablename ) < 1e6, // arbitrary limit to prevent slow down, here and below
                    dt = Open( tablename, invisible );
                    view << append( Journal Box( Data Table Box( dt ) << get journal ) );
                    Close( dt, nosave );//
                , // else
                    view << append( Text Box( "table>1MB, too big for this viewer, click link above to open it" ) );//
                );//
        , /*else if*/
            Ends With( extension, "\" ), //
                view << append( Text Box( "This is a directory. Click the link above to open it." ) );//
        , // else
            view << append( Text Box( "Not sure what this is, don't click the link unless you want JMP to open() it." ) );//
        );
    );
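
The Regex() call that pulls out the extension also explains how the directory case works. In this small sketch the sample paths are hypothetical; the pattern is the one used in treeClickHandler.

```jsl
// How the click handler derives a file "extension" from a path.
// The sample paths are hypothetical.
paths = {"C:\demo\photo.PNG", "C:\demo\notes.txt", "C:\demo\sub\"};
exts = {};
For Each( {p}, paths,
    ext = Regex( p, ".*?([^\.]*)$", "\1" ); // everything after the last dot
    Insert Into( exts, ext );
    Show( p, Lowercase( ext ), Ends With( ext, "\" ) );
);
```

The first two paths give PNG and txt. The directory path has no dot at all, so the capture group swallows the whole path, which still ends with a backslash; that is what the Ends With() branch in the handler relies on.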

Finally, make the root node, pass it to explore(), and make a window with a Tree Box to show the tree node control. There is a button that will open() files; be careful pressing it. A short list of executable file extensions (not shown, and certainly not comprehensive) tries to keep you out of trouble...

    rootTreeNode = Tree Node( Eval Insert( "\[^Format(directory[root][nBytes], "Fixed Dec", Use thousands separator( 1 ), 30, 0 )^ ^root^]\" ) );

    rootTreeNode << seticon( icon( directory[root][nBytes] ) );

    explore( root, rootTreeNode );
    windowbox = windowbox << parent;
    (windowbox << child) << delete;
    windowbox << append(
        H Splitter Box(
            size( 1000, 600 ),
            treebox = Tree Box( rootTreeNode ),
            vlistbox = V List Box(
                line1 = Text Box(), // also updated by treeClickHandler
                line2 = Button Box( "", // gets link-style, below. Visible name updated by treeClickHandler script.
                    setfunction( // when the button clicks, use its current visible name to open() the file. 
                        Function( {this},
                            name = this << getbuttonname;// the underlined button name *is* the file name
                            extension = Uppercase( Regex( name, ".*?([^\.]*)$", "\1" ) );
                            If( !Contains( executable, extension ), // seems a bad idea
                                Open( name, Add to Recent Files( 0 ) )
                            ,
                                Beep()
                            );
                        )
                    )
                ),
                scrollbox = Scroll Box( view = Border Box() )
            )
        )
    );
    windowbox << Set Window Title( "Directory Tree " || root );
    line2 << UnderlineStyle( 1 );
    line1 << Set  Stretch( "fill", "off" );
    line2 << Set  Stretch( "fill", "off" );
    scrollbox << Set  Stretch( "off", "off" );
    treebox << Set Auto Stretching( 1, 1 ) << Set Max Size( 10000, 10000 ) << User Resizable( {0, 0} );
    scrollbox << Set Auto Stretching( 1, 1 ) << Set Max Size( 10000, 10000 ) << User Resizable( {0, 0} );
    treebox << Set Node Select Script( treeClickHandler );
    treebox << expand( rootTreeNode );
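
The button script above depends on a variable named executable that is not shown in this post. Here is a hypothetical stand-in to show how the check behaves; the extensions in the list and the sample path are guesses for illustration, not the add-in's actual list.

```jsl
// Hypothetical stand-in for the add-in's list of executable extensions.
// The real list is not shown in this post; these entries are guesses.
executable = {"EXE", "BAT", "CMD", "COM", "MSI"};
name = "C:\demo\setup.exe"; // made-up path
extension = Uppercase( Regex( name, ".*?([^\.]*)$", "\1" ) );
blocked = Contains( executable, extension ) > 0; // Contains() returns a position, or 0
If( blocked,
    Write( "blocked " || name ),
    Write( "would open " || name )
);
```

Because the extension is uppercased before the lookup, the list only needs one spelling of each entry.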

 

A picture

 

A table

 

A journal

If you install the add-in, you can peek at the JSL using view->addins. You can also download it and rename it with a .zip extension.

 

(The first pass of this JSL used outline nodes, which allow for a lot more bells and whistles than the tree nodes. But outline nodes are not well suited to displaying 100K+ nested nodes; tree nodes are designed for that job. If you spot some leftover outline references (I think I cleaned them up...), that's what happened.)

Last Modified: Jan 27, 2022 10:18 PM