vince_faller
Super User

Get list of files modified after a certain date without a for loop

I'm trying to find all the files in a directory that were modified after a specific date. The directory is on a network drive and holds about a million files, so a simple for loop calling Last Modification Date() on each file takes roughly six hours. Has anyone ever run into something like this?
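For reference, the slow approach described above probably looks something like the following sketch. The folder path and cutoff date are placeholders chosen for illustration, not values from the original script.

```jsl
// Baseline sketch of the per-file loop (the slow approach).
// Assumptions: the folder and the cutoff date are hypothetical.
dir = "F:/MandelbrotTrace/pics/";   // trailing slash so names can be appended
cutoff = 01Jan2019;                 // JSL date literal

files = Files In Directory( dir );  // list of file names in the folder
recent = {};

For( i = 1, i <= N Items( files ), i++,
	// one file-system query per file; on a network drive each call is slow
	If( Last Modification Date( dir || files[i] ) > cutoff,
		Insert Into( recent, files[i] )
	)
);

Show( N Items( recent ) );
```

Each pass through the loop makes a separate request to the file system, so the total run time grows with the number of files multiplied by the network latency.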

Vince Faller - Predictum
1 ACCEPTED SOLUTION

Craige_Hales
Staff (Retired)

Re: Get list of files modified after a certain date without a for loop

See if this helps:

// requires JMP 14

// on a virtual machine, the network drive F will process
// around 50K files per minute this way

mfi=Multiple File Import(
	<<Set Folder( "F:\MandelbrotTrace\pics" ),
	<<Set Name Filter( "*.jpg;" ),
	<<Set Name Enable( 1 ),
	<<Set Size Filter( {50000,100000} ),
	<<Set Size Enable( 1 ),
	<<Set Date Filter( {2019-01-01, 2019-01-03} ),
	<<Set Date Enable( 1 )
); // note: no import is performed without the <<import method.

// the 3 filters can be respecified for nearly free
// as long as the folder isn't changed and mfi object
// isn't closed.

mfi<<Set Size Filter( {50000,300000} );
mfi<<Set Date Filter( {informat("2019-01-02T08:41:32"), informat("2019-01-02T08:43:52")} );

// make a table of selected files

dtFiltered = mfi<<showSelection();

// make a table of rejected files 

dtUnFiltered = mfi<<showRejection();

[Image: 50K/minute for this configuration of a network]

[Image: Columns include filter values]


Craige


6 REPLIES
uday_guntupalli
Level VIII

Re: Get list of files modified after a certain date without a for loop

@vince_faller,
Have you tried using Last Modification Date() in a column formula in a private data table?

Best
Uday
vince_faller
Super User

Re: Get list of files modified after a certain date without a for loop

It's not the for loop overhead. It's the network drive ping. Every call to it seems to take more than a second.
Vince Faller - Predictum
ron_horne
Super User

Re: Get list of files modified after a certain date without a for loop

Hi @vince_faller,

Perhaps you can try the Files In Directory() function, as in the first section of the script by @Craige_Hales at this link: https://community.jmp.com/t5/Uncharted/Files-In-Directory/ba-p/21232

Alternatively, consider linking to R and using list.files():

https://stat.ethz.ch/R-manual/R-devel/library/base/html/list.files.html

Is this in the right direction?

Ron

Craige_Hales
Staff (Retired)

Re: Get list of files modified after a certain date without a for loop

If MFI doesn't do what you need, open a cmd.exe window and see how long the DIR command takes. If it is too slow, there isn't much else I know to do, but it is possible it will be a lot faster. In that case, there is a scripting index example under RunProgram that will get you going (2nd example, I think). 

DIR has a lot of arguments that affect formatting and results.
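As a rough sketch of that idea, the script below times a DIR listing launched through RunProgram. The folder is only a placeholder, and the particular DIR switches are one reasonable choice rather than the ones from the scripting index example. DIR prints dates in the machine's regional format, so any date parsing would need to be adapted locally and is left out here.

```jsl
// Sketch: time a DIR listing of the folder via cmd.exe.
// Assumption: the folder is a placeholder; point it at the real network path.
folder = "F:\MandelbrotTrace\pics";

start = Tick Seconds();

// cmd.exe /c runs one command and exits. DIR switches used:
//   /a-d  list files only, no directories
//   /t:w  report the last-written time
//   /o-d  sort by date, newest first
//   /-c   no thousands separators in file sizes
txt = Run Program(
	Executable( "cmd.exe" ),
	Options( {"/c", "dir", "/a-d", "/t:w", "/o-d", "/-c", folder} ),
	Read Function( "text" )   // return all output as one string
);

elapsed = Tick Seconds() - start;

// split the output into lines; each file occupies one line
lines = Words( txt, "\!r\!n" );
Show( elapsed, N Items( lines ) );
```

Because /o-d sorts newest first, a script could stop reading at the first line older than the cutoff date instead of examining every line.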

Craige
vince_faller
Super User

Re: Get list of files modified after a certain date without a for loop

Right now we're just running a simple batch file, but that has its own problems. I'll try the MFI; it looks promising.
Vince Faller - Predictum