hogi
Level XI

Multi file Import: Discovering Files takes too long

MFI is great - I really love it.

 

But sometimes the low speed of the file search hurts.
If the data is on a network drive and I access the folder from my home office, Discovering Files can take more than 10 minutes to find my 4 matching files among thousands of folders, subfolders, and files.

 

Then it's faster to open Python in parallel, write a short glob search, and load the files one by one via a for loop and open(filename). That is much faster than waiting for MFI to populate the Files list with all the files that fail the filename match.
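For readers who want to try the workaround described above, here is a minimal sketch of a recursive glob search followed by a plain for loop with open(filename). The function names `find_matching_files` and `load_files` and the example pattern are hypothetical, chosen only for illustration; the original post does not show its script.

```python
import glob
import os


def find_matching_files(root, pattern):
    """Return all paths below root whose file name matches pattern."""
    # "**" combined with recursive=True descends into every subfolder.
    search = os.path.join(root, "**", pattern)
    return sorted(glob.glob(search, recursive=True))


def load_files(paths):
    """Read each file in turn and return a dict mapping path to its text."""
    contents = {}
    for filename in paths:
        # errors="replace" keeps the loop going if a file has odd bytes.
        with open(filename, "r", encoding="utf-8", errors="replace") as handle:
            contents[filename] = handle.read()
    return contents
```

A call such as `load_files(find_matching_files(root_folder, "run_*.csv"))` would then read only the matching files, where `root_folder` and the pattern are placeholders for your own folder and naming scheme.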

 

The Python/glob example shows that the file search can be orders of magnitude faster. Is there any trick in JMP/MFI to speed this up?

2 ACCEPTED SOLUTIONS

Craige_Hales
Super User

Re: Multi file Import: Discovering Files takes too long

The glob search is probably faster because it does not retrieve date and size.

https://stackoverflow.com/questions/23430395/glob-search-files-in-date-order

There is no way to tell MFI to skip retrieving date and size (but it could be a wish.)

 

Edit: this is much less noticeable on the machine's local file system; over a network disk, fetching name+date+size probably takes about 3X the communication of fetching just the name. Also, you might be able to use the recursive filesInDirectory, which should take about as long as the glob search, and then pick the files you want from that full list with a pattern match. And be careful when benchmarking: the second run may be much faster if caching is involved, easily 100X if the communication time goes to zero.
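The idea above (list every name first, then pattern-match, and benchmark with caching in mind) can be sketched in Python as a rough analogue; it is not JSL and says nothing about how MFI works internally. All function names here are hypothetical. The `os.stat` call stands in for "also fetch date and size", on the assumption that each such request costs extra communication on a network drive.

```python
import fnmatch
import os
import time


def list_names_only(root):
    """Walk root recursively and collect every file path (names only)."""
    paths = []
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            paths.append(os.path.join(folder, name))
    return paths


def list_names_with_metadata(root):
    """Same walk, but also request size and modification time per file."""
    rows = []
    for path in list_names_only(root):
        info = os.stat(path)  # one extra metadata request per file
        rows.append((path, info.st_size, info.st_mtime))
    return rows


def pick_matches(paths, pattern):
    """Keep only the paths whose file name matches a glob-style pattern."""
    return [p for p in paths if fnmatch.fnmatch(os.path.basename(p), pattern)]


def timed(func, *args):
    """Run func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start
```

To compare fairly, time both `list_names_only` and `list_names_with_metadata` with `timed`, and repeat each measurement: if the second run of the same function is dramatically faster than the first, caching is probably dominating and the numbers say little about the network cost.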

(And change date+time to date+size.)

Craige


hogi
Level XI

Re: Multi file Import: Discovering Files takes too long

I posted a wish to optimize the speed of MFI:
Multi File import: add a fast mode 

