vince_faller
Super User (Alumni)

Get list of files modified after a certain date without a for loop

I'm trying to find all the files in a directory modified after a specific date. The directory is a network drive with a million files in it, so a simple for loop with LastModificationDate() takes roughly six hours. Has anyone ever run into something like this?

Vince Faller - Predictum
Accepted Solutions
Craige_Hales
Super User

Re: Get list of files modified after a certain date without a for loop

See if this helps:

// requires JMP 14

// on a virtual machine, the network drive F will process
// around 50K files per minute this way

mfi=Multiple File Import(
	<<Set Folder( "F:\MandelbrotTrace\pics" ),
	<<Set Name Filter( "*.jpg;" ),
	<<Set Name Enable( 1 ),
	<<Set Size Filter( {50000,100000} ),
	<<Set Size Enable( 1 ),
	<<Set Date Filter( {2019-01-01, 2019-01-03} ),
	<<Set Date Enable( 1 )
); // note: nothing is imported here, because the <<import method is never called.

// the 3 filters can be respecified for nearly free
// as long as the folder isn't changed and mfi object
// isn't closed.

mfi<<Set Size Filter( {50000,300000} );
mfi<<Set Date Filter( {informat("2019-01-02T08:41:32"), informat("2019-01-02T08:43:52")} );

// make a table of selected files

dtFiltered = mfi<<showSelection();

// make a table of rejected files 

dtUnFiltered = mfi<<showRejection();
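
For the original question (everything modified after one cutoff date), the same object can be given a range whose upper end is simply in the future. This is a minimal sketch, assuming `mfi` is the object created above; the cutoff value and the variable names `cutoff`, `upper`, and `dtRecent` are illustrative, and passing 0 to the Enable messages to switch a filter off is an assumption based on the 1 used above.

```jsl
// Assumption: mfi is the Multiple File Import object created earlier.
cutoff = Date DMY( 1, 1, 2019 );     // illustrative cutoff: 1 January 2019
upper = Today() + In Days( 1 );      // any date safely in the future

// Assumption: 0 disables a filter, mirroring the 1 used to enable it.
mfi << Set Name Enable( 0 );
mfi << Set Size Enable( 0 );

// Eval List turns the variable names into numbers before they are sent.
mfi << Set Date Filter( Eval List( {cutoff, upper} ) );
mfi << Set Date Enable( 1 );

// Table of files that pass the date filter.
dtRecent = mfi << showSelection();
Show( N Rows( dtRecent ) );
Show( dtRecent << Get Column Names( String ) ); // inspect which columns are available
```

Listing the column names first avoids guessing at how the selection table is laid out before pulling the file names out of it.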

Figure: 50K/minute for this configuration of a network

Figure: Columns include filter values

 

Craige


6 REPLIES
uday_guntupalli
Level VIII

Re: Get list of files modified after a certain date without a for loop

@vince_faller,
     Have you tried using Last Modification Date() in a column formula in a private data table?

Best
Uday
vince_faller
Super User (Alumni)

Re: Get list of files modified after a certain date without a for loop

It's not the for loop overhead. It's the network drive ping. Every call to it seems to take more than a second.
Vince Faller - Predictum
ron_horne
Super User (Alumni)

Re: Get list of files modified after a certain date without a for loop

Hi @vince_faller

 

Perhaps you can try using the Files In Directory() function, as in the first section of the script by @Craige_Hales at this link: https://community.jmp.com/t5/Uncharted/Files-In-Directory/ba-p/21232
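
A rough sketch of what that looks like, assuming the folder from the accepted solution and an illustrative cutoff (the names `dir`, `cutoff`, and `recent` are made up for this example). Files In Directory() returns names only, so the dates still have to be fetched one file at a time, which is the slow part on a network drive.

```jsl
dir = "F:\MandelbrotTrace\pics";     // folder taken from the accepted solution
cutoff = Date DMY( 1, 1, 2019 );     // illustrative cutoff date

// One call returns the list of file names (no dates).
files = Files In Directory( dir );
Show( N Items( files ) );

// The date check is still one call per file.
recent = {};
For( i = 1, i <= N Items( files ), i++,
	If( Last Modification Date( dir || "/" || files[i] ) > cutoff,
		Insert Into( recent, files[i] )
	)
);
Show( N Items( recent ) );
```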

 

Alternatively, consider linking to R and using the following:

https://stat.ethz.ch/R-manual/R-devel/library/base/html/list.files.html

 

Is this in the right direction?

 

Ron

Craige_Hales
Super User

Re: Get list of files modified after a certain date without a for loop

If MFI doesn't do what you need, open a cmd.exe window and see how long the DIR command takes. It is possible it will be a lot faster; if it is also too slow, there isn't much else I know to do. If it is fast enough, there is a Scripting Index example under RunProgram that will get you going (2nd example, I think).

DIR has a lot of arguments that affect formatting and results.
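
As a hedged sketch (not the Scripting Index example itself), DIR could be called through RunProgram like this. The folder is the one from the MFI example, and the choice of switches is an assumption about what is useful here: /a:-d skips directories, /o:-d sorts newest first, and /t:w uses the last-written time. How RunProgram joins the options into a command line may need adjusting, especially for paths containing spaces.

```jsl
// Run DIR once and capture its whole text output.
listing = Run Program(
	Executable( "cmd.exe" ),
	Options( {"/c", "dir", "/a:-d", "/o:-d", "/t:w", "F:\MandelbrotTrace\pics"} ),
	ReadFunction( "text" )
);

// Split into lines; parsing the date column is left out because its
// format depends on the machine's regional settings.
lines = Words( listing, "\!n\!r" );
Show( N Items( lines ) );
```

With the newest-first sort, a parser could stop at the first line older than the cutoff instead of reading the full listing.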

Craige
vince_faller
Super User (Alumni)

Re: Get list of files modified after a certain date without a for loop

Right now we're just running a simple batch file, but that has its own problems. I'll try the MFI; it looks promising.
Vince Faller - Predictum