dadawasozo
Level IV

read and save a txt file from a subdirectory of a ZIP file

Hi,

Can someone help? I have tried many things and it still fails.

I am trying to loop through a list of zip files to find a specific txt file that sits in a subdirectory inside each zip file. When I get the file list of a zip file, the txt file appears as subdirectory/mytxtfile_123.log. Below is my script, but it fails to save the txt file. There is an IO problem that complains "The system cannot find the path specified". I wonder if the cause is the "\" being flipped to "/" when I get the file list. Is there a neat way to get a file that is in a subdirectory of a zip file?

// path  = path where all the zip files are located
// path1 = path where the txt file is going to be saved
// lst   = {the list of the zip files}

For( k = 1, k <= N Items( lst ), k++,
	Try(
		za = Open( path || lst[k], zip );
		dirlist = za << dir;

		For( l = 1, l <= N Items( dirlist ), l++,
			If( Contains( dirlist[l], "mytxtfile_" ),
				text = za << read( dirlist[l] );
				Save Text File( path1 || lst[k] || "_" || dirlist[l], text );
			)
		)
	,
		Print( lst[k] || " failed to open" );
		Throw();
	)
);
Accepted Solutions
Craige_Hales
Super User

Re: read and save a txt file from a subdirectory of a ZIP file

The thing to know about zip files: they do not actually have a nested hierarchical directory structure. A zip file's directory is just a flat list of character string names. Those names might contain / or \ characters, and some zip file viewers represent them as a nested hierarchy. So the zip file member whose name contains mytxtfile also contains / characters in its name, and when this path is constructed

Save Text File( path1 || lst[k] || "_" || dirlist[l], text );

dirlist[l] contains "test/test/mytxtfile.txt", and the pathname that results from the concatenation probably does not exist on your computer.

The word() or words() functions might be useful for separating the parts of the path.
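
As a minimal sketch of that idea, assuming the member name uses / as its separator (as in the dirlist posted in this thread), Word() with a negative index picks out the last part of the name, which can then be appended to the output folder. The values of path1, zipname, and member below are hypothetical placeholders, not values from the original script.

```jsl
Names Default To Here( 1 );

// Hypothetical values, for illustration only
path1 = "$TEMP/";                     // folder where the text file would be saved
zipname = "test.zip";                 // stands in for lst[k]
member = "test/test/mytxtfile.txt";   // stands in for dirlist[l]

// Keep only the last "/"-separated part of the member name (the bare file name)
filename = Word( -1, member, "/" );

// The output path no longer contains a subdirectory that may not exist on disk
outpath = path1 || zipname || "_" || filename;
Show( filename, outpath );
```

With these placeholder values, filename should come out as mytxtfile.txt. Words( member, "/" ) returns all of the parts as a list instead, which may be handier if the subdirectory names are needed as well.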

 

Craige

REPLIES
dadawasozo
Level IV

Re: read and save a txt file from a subdirectory of a ZIP file

I added an example zip file here. Below are its file list and the error message:
dirlist = {"test/12.txt", "test/hello.txt", "test/me/", "test/me/testing.txt", "test/test/", "test/test/mytxtfile.txt"};
IO problem: "mytxtfile" Unable to open in ReadWrite mode.
The system cannot find the path specified.

jthi
Super User

Re: read and save a txt file from a subdirectory of a ZIP file

I haven't (yet) checked where your script goes wrong, but here is a version that should save the txt file (requires JMP 16+ because of For Each() and Filter Each()):

Names Default To Here(1);

file_of_interest = "mytxtfile.txt";
zip_path = "$DOWNLOADS\test.zip";
save_path = "$DOWNLOADS\586819.txt";

za = Open(zip_path, "zip");
members_in_zip = za << dir;

// keep only the members whose name ends with file_of_interest
file_paths_of_interest = Filter Each({members}, members_in_zip,
	Ends With(members, file_of_interest);
);

If(N Items(file_paths_of_interest) == 0,
	Throw("Not found");
);

// Assume that only one such file exists
file_path_of_interest = file_paths_of_interest[1];

data = za << read(file_path_of_interest, Format("string"));
Save Text File(save_path, data);

//Open(save_path);
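
The same idea can be put back into a loop over several zip files, as in the original question. This is only a sketch under stated assumptions: the folders and the contents of lst are hypothetical placeholders, the output folder is assumed to exist already, and member names are assumed to use / as the separator.

```jsl
Names Default To Here( 1 );

// Hypothetical inputs; replace with your own folders and zip file names
path = "$DOWNLOADS/";        // folder holding the zip files
path1 = "$DOWNLOADS/out/";   // folder for the saved text files (assumed to exist)
lst = {"test.zip"};

For( k = 1, k <= N Items( lst ), k++,
	za = Open( path || lst[k], "zip" );
	dirlist = za << dir;
	For( l = 1, l <= N Items( dirlist ), l++,
		// skip directory entries such as "test/test/" and keep matching files
		If( Contains( dirlist[l], "mytxtfile" ) & !Ends With( dirlist[l], "/" ),
			text = za << read( dirlist[l] );
			// use only the last part of the member name in the output path
			Save Text File( path1 || lst[k] || "_" || Word( -1, dirlist[l], "/" ), text );
		)
	);
);
```

Using For() instead of For Each() keeps this variant close to the original script; the only substantive change is that the saved file name is built from the last part of the member name rather than from the whole name.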

-Jarmo