<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Pull ZIP Files from HTTP Link in Discussions</title>
    <link>https://community.jmp.com/t5/Discussions/Pull-ZIP-Files-from-HTTP-Link/m-p/256506#M50391</link>
    <description>&lt;P&gt;Thanks, Craige, for the comprehensive reply. I'll take elements of both suggestions and merge them into a generic function to suit my current and future needs.&lt;/P&gt;&lt;P&gt;I knew I'd have to use a regex to find the links in the HTTP source, but I was hoping for a '&amp;lt;&amp;lt; saveLink' function for the zip part. :)&lt;/P&gt;&lt;P&gt;This will work perfectly fine though.&lt;/P&gt;&lt;P&gt;Great answer(s)&lt;/P&gt;</description>
    <pubDate>Tue, 07 Apr 2020 09:20:50 GMT</pubDate>
    <dc:creator>thickey1</dc:creator>
    <dc:date>2020-04-07T09:20:50Z</dc:date>
    <item>
      <title>Pull ZIP Files from HTTP Link</title>
      <link>https://community.jmp.com/t5/Discussions/Pull-ZIP-Files-from-HTTP-Link/m-p/256290#M50350</link>
      <description>&lt;P&gt;I have published ZIP files and want to pull them programmatically and store them on my PC using JSL. I don't know up front how many files will be present.&lt;/P&gt;&lt;P&gt;Is this possible with JSL?&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="zip.png" style="width: 760px;"&gt;&lt;img src="https://community.jmp.com/t5/image/serverpage/image-id/22799i41FB46221876451A/image-size/large?v=v2&amp;amp;px=999" role="button" title="zip.png" alt="zip.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Mon, 06 Apr 2020 12:13:42 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Pull-ZIP-Files-from-HTTP-Link/m-p/256290#M50350</guid>
      <dc:creator>thickey1</dc:creator>
      <dc:date>2020-04-06T12:13:42Z</dc:date>
    </item>
    <item>
      <title>Re: Pull ZIP Files from HTTP Link</title>
      <link>https://community.jmp.com/t5/Discussions/Pull-ZIP-Files-from-HTTP-Link/m-p/256336#M50357</link>
      <description>&lt;P&gt;This example uses the zip file from&amp;nbsp;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/4552"&gt;@wilkap&lt;/a&gt;'s&amp;nbsp;&lt;A href="https://community.jmp.com/t5/Virtual-JMP-Users-Group/VJUG-July-2015-zip/gpm-p/22641" target="_blank" rel="noopener"&gt;presentation&lt;/A&gt;:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;za=open("https://community.jmp.com/kvoqx44227/attachments/kvoqx44227/virtual-jug/12/1/VJUG%20July%202015.zip","zip");
zipfiles=za&amp;lt;&amp;lt;dir;
show(zipfiles);
blob=za&amp;lt;&amp;lt;read(zipfiles[4],format(blob));
dt=open(blob,jmp);
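// Sketch, not part of the original reply: save every archive member instead of reading only item 4.
// Assumes member names contain no subfolders; "$temp/" is a placeholder destination.
for(i=1,i&amp;lt;=nitems(zipfiles),i++,
	savetextfile("$temp/"||zipfiles[i],za&amp;lt;&amp;lt;read(zipfiles[i],format(blob)))
);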
clearglobals(za);&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Several things to note:&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;The file is downloaded to your temp directory; the "zip" option to open returns a zip archive object.&lt;/LI&gt;&lt;LI&gt;You can get a list of members from the zip archive using &amp;lt;&amp;lt;dir.&lt;/LI&gt;&lt;LI&gt;You can use the blob format with a zip archive to read binary data such as JMP tables.&lt;/LI&gt;&lt;LI&gt;The dt=open(blob,jmp) line uses a 2nd argument to tell open that the blob is a JMP data table.&lt;/LI&gt;&lt;LI&gt;Clearing the za variable is needed if you rerun the whole script; while the zip archive object exists, it keeps the file in the temp directory from being reused.&lt;/LI&gt;&lt;LI&gt;You could use loadtextfile/savetextfile with blobs to download the zip file to a location of your choice (and delete it when done) and then use the zip archive to process that file.&lt;/LI&gt;&lt;LI&gt;I had already checked that the 4th item in the archive directory was a JMP data table.&lt;/LI&gt;&lt;/UL&gt;</description>
      <pubDate>Mon, 06 Apr 2020 14:44:43 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Pull-ZIP-Files-from-HTTP-Link/m-p/256336#M50357</guid>
      <dc:creator>Craige_Hales</dc:creator>
      <dc:date>2020-04-06T14:44:43Z</dc:date>
    </item>
    <item>
      <title>Re: Pull ZIP Files from HTTP Link</title>
      <link>https://community.jmp.com/t5/Discussions/Pull-ZIP-Files-from-HTTP-Link/m-p/256347#M50362</link>
      <description>&lt;P&gt;Or maybe this is closer to what you are asking:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-jsl"&gt;path="https://www.vsp.virginia.gov/downloads/"; // a page with an index of files. Yours may use a different format; adjust the pattern below.
html = loadtextfile(path); // get the HTML text so we can scrape the links
// somewhat custom pattern for scraping the links, may be specific to this page
urls = {}; // this list will collect the urls 
rc = patmatch(html,
	patpos(0)+ // make sure the pattern matches from the start
	patrepeat( // this is the loop that extracts the urls from the html
		(
			// the urls look like &amp;lt;a href="2017%20Virginia%20Firearms%20Dealers%20Procedrures%20Manual.pdf"&amp;gt;
			// and we want just the part between the quotation marks. Quickly scan forward (patBreak)
			// for a &amp;lt; then see if it matches. &amp;gt;&amp;gt;url grabs the text between quotation marks.
			(patbreak("&amp;lt;") + "&amp;lt;a href=\!"" + patbreak("\!"") &amp;gt;&amp;gt; url + pattest(insertinto(urls,url);1))
			| // OR
			patlen(1) // skip forward one character
		)
		+
		patfence() // fence off the successfully matched text. There is no need to backtrack if something goes wrong.
	) + 
	patrepeat(patnotany("&amp;lt;"),0) + // any trailing bits of html are consumed here
	patrpos(0) // make sure the pattern matches to the end
);

if(rc==0, throw("pattern did not match everything"));
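
// Sketch, not part of the original reply: download every scraped link that ends in ".zip".
// Assumes the links are relative to path and contain no subfolders; "$temp/" is a placeholder destination.
for(i=1,i&amp;lt;=nitems(urls),i++,
	if(endswith(lowercase(urls[i]),".zip"),
		name=regex(urls[i],"%20"," ",GLOBALREPLACE); // same minimal fix-up as used for item 6 below
		savetextfile("$temp/"||name,loadtextfile(path||name,blob))
	)
);
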
show(nitems(urls),urls[6]); // pick item 6. You'll have a different strategy.

fullpath = path||regex(urls[6],"%20"," ",GLOBALREPLACE);// minimal effort to fix up the url, might need more work

pdfblob = loadtextfile(fullpath,blob); // download item 6; it was a pdf when this was written
savetextfile("$temp/example.pdf",pdfblob); // save it somewhere&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 06 Apr 2020 16:52:16 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Pull-ZIP-Files-from-HTTP-Link/m-p/256347#M50362</guid>
      <dc:creator>Craige_Hales</dc:creator>
      <dc:date>2020-04-06T16:52:16Z</dc:date>
    </item>
    <item>
      <title>Re: Pull ZIP Files from HTTP Link</title>
      <link>https://community.jmp.com/t5/Discussions/Pull-ZIP-Files-from-HTTP-Link/m-p/256506#M50391</link>
      <description>&lt;P&gt;Thanks, Craige, for the comprehensive reply. I'll take elements of both suggestions and merge them into a generic function to suit my current and future needs.&lt;/P&gt;&lt;P&gt;I knew I'd have to use a regex to find the links in the HTTP source, but I was hoping for a '&amp;lt;&amp;lt; saveLink' function for the zip part. :)&lt;/P&gt;&lt;P&gt;This will work perfectly fine though.&lt;/P&gt;&lt;P&gt;Great answer(s)&lt;/P&gt;</description>
      <pubDate>Tue, 07 Apr 2020 09:20:50 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Pull-ZIP-Files-from-HTTP-Link/m-p/256506#M50391</guid>
      <dc:creator>thickey1</dc:creator>
      <dc:date>2020-04-07T09:20:50Z</dc:date>
    </item>
    <item>
      <title>Re: Pull ZIP Files from HTTP Link</title>
      <link>https://community.jmp.com/t5/Discussions/Pull-ZIP-Files-from-HTTP-Link/m-p/256509#M50393</link>
      <description>&lt;P&gt;Glad you can get something out of it! I'm pretty sure the pattern could be made faster. It probably makes no difference for directories of only a few thousand links, but the patlen(1) alternative could skip non-link text faster. A more flexible pattern for the links would be better too.&lt;/P&gt;&lt;P&gt;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/5036"&gt;@bryan_boone&lt;/a&gt;&amp;nbsp;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/6331"&gt;@ErnestPasour&lt;/a&gt;&amp;nbsp;&lt;a href="https://community.jmp.com/t5/user/viewprofilepage/user-id/12269"&gt;@paul_vezzetti&lt;/a&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 07 Apr 2020 10:54:55 GMT</pubDate>
      <guid>https://community.jmp.com/t5/Discussions/Pull-ZIP-Files-from-HTTP-Link/m-p/256509#M50393</guid>
      <dc:creator>Craige_Hales</dc:creator>
      <dc:date>2020-04-07T10:54:55Z</dc:date>
    </item>
  </channel>
</rss>

