Import data from html page issue
Hi,
I am trying to import the data from the following website:
http://www.infoclimat.fr/observations-meteo/archives/24/janvier/2017/grenoble-st-geoirs/07486.html
It is basically a record of meteorological data.
I am using the JSL command:
Open("http://www.infoclimat.fr/observations-meteo/archives/24/janvier/2017/grenoble-st-geoirs/07486.html");
But it fails. Does anyone have a solution?
Thanks and regards,
Jérôme
Re: Import data from html page issue
I tried to open the web page interactively using
File > Internet Open
and got the following error message:
This web page uses an unsupported character set: "iso-8859-15".
I would contact JMP directly to ask when such support might be available.
Re: Import data from html page issue
Have you tried File > Internet Open...? In the dialog, you can choose to open the HTML as data. You should get two data tables. For the one you want, look at the "Source" script in the table script area to see the JSL that opens it directly.
Best,
M
Re: Import data from html page issue
@txnelson, that's odd. I just did it through the GUI and it worked fine.
JSL for the first table:
Open(
    "http://www.infoclimat.fr/observations-meteo/archives/24/janvier/2017/grenoble-st-geoirs/07486.html",
    HTML Table( 1, Column Names( 0 ), Data Starts( 1 ) )
);
JSL for the second table:
Open(
    "http://www.infoclimat.fr/observations-meteo/archives/24/janvier/2017/grenoble-st-geoirs/07486.html",
    HTML Table( 2, Column Names( 1 ), Data Starts( 2 ) )
);
There do appear to be a lot of pictures in the data table.
Re: Import data from html page issue
Here is the log I get when I run your little piece of JSL to get to the first table:
TEConverter - unrecognized charset in access or evaluation of 'Open' , Open/*###*/( "http://www.infoclimat.fr/observations-meteo/archives/24/janvier/2017/grenoble-st-geoirs/07486.html", HTML Table( 1, Column Names( 0 ), Data Starts( 1 ) ))
In the following script, error marked by /*###*/
Open/*###*/( "http://www.infoclimat.fr/observations-meteo/archives/24/janvier/2017/grenoble-st-geoirs/07486.html", HTML Table( 1, Column Names( 0 ), Data Starts( 1 ) ))
Re: Import data from html page issue
@txnelson - it appears to be a Mac vs. Windows issue. It runs fine on my macOS box, but on my Windows partition I get the same error you do.
@j_bonnouvrier - you probably want to report this to tech support, as @txnelson originally suggested.
Re: Import data from html page issue
Thanks, that's what I will do!
Re: Import data from html page issue
It looks like JMP on Windows does not handle ISO/IEC 8859-15 very well. Here's a workaround if you need it (the sockets work only for http, not for https). It is mostly taken from the rewritten scripting example in JMP 13.1. At the end, the page is imported as iso-8859-1, which appears to be close enough; the Wikipedia page shows eight characters that might be problems.
// <<connect is used to connect to a remote computer with an open listening socket.
// Some web sites require www, some don't like it. Some require the HTTP/1.1 format, some are happy with HTTP/1.0 in the GET.
// You'll want to make the error handling more robust...
skt = Socket();
rc = skt << connect( "www.infoclimat.fr", "80" );
If( rc[2] != "ok",
    Show( rc );
    Stop();
);
// Request a resource from the remote computer. The trailing slash in the example might or might not be needed.
skt << Send(
    Char To Blob(
        "GET /observations-meteo/archives/24/janvier/2017/grenoble-st-geoirs/07486.html HTTP/1.1~0d~0aHost: www.infoclimat.fr~0d~0aConnection: Close~0d~0a~0d~0a",
        "ascii~hex"
    )
);
// Gather the response; it might be more than one buffer.
blob = Char To Blob( "" ); // start with nothing
blobtext = "";
timeout = 50; // give remote a short time to send a response. You might need to tune the timeout behavior.
While( timeout-- > 0,
    rc = skt << recv( 10000 );
    If(
        rc[2] == "ok",
            blob = blob || rc[3]; // recv always returns a blob, not text
            Write( "\!nsome text received" ); // typically about three of these, ymmv
            timeout = 50; // reset, still receiving ok
        , // else
        Starts With( rc[2], "CLOSED" ),
            Break(); // done
        , // else
        Starts With( rc[2], "WOULDBLOCK" ),
            Write( "\!nwaiting..." ) // still fetching, maybe
        , // else...what?
            Show( rc );
            Stop(); // that was unexpected
    );
    blobtext = Blob To Char( blob, encoding = "utf-8" );
    If( timeout == 0,
        Write( "\!nTimed out waiting for remote request" );
        Stop();
    );
    Wait( .05 ); // give the OS some cycles to work with the incoming data
);
Show( Length( blob ) );
// JMP does not understand what the page claims: iso-8859-15
txt = Blob To Char( blob, encoding = "iso-8859-1" );
Length( txt );
f = Save Text File( "$temp/deleteme.html", txt );
Open( f, HTML Table( 2 ) );
Imported table
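For anyone wondering which eight characters are at risk when the page is read as iso-8859-1 instead of iso-8859-15, they can be listed with a short script. This is only a sketch, written in Python rather than JSL because JMP on Windows rejects iso-8859-15 in the first place; the function name charset_differences is made up for illustration and is not part of JMP or of the script above.

```python
def charset_differences():
    """Return (byte, iso-8859-1 char, iso-8859-15 char) for each byte that decodes differently."""
    diffs = []
    for value in range(256):
        raw = bytes([value])
        as_latin1 = raw.decode("iso-8859-1")
        as_latin9 = raw.decode("iso-8859-15")
        if as_latin1 != as_latin9:
            diffs.append((value, as_latin1, as_latin9))
    return diffs


# Print one line per byte whose meaning changes between the two charsets.
for value, as_latin1, as_latin9 in charset_differences():
    print(f"0x{value:02X}: iso-8859-1 {as_latin1!r}  iso-8859-15 {as_latin9!r}")
```

The accented French letters (é, è, à, ç) and the degree sign decode identically in both charsets, so most of a weather table should survive the substitution. Only the eight listed bytes, which include the euro sign and œ, would come out as a different character.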
Re: Import data from html page issue
Great response, Craige.
A document needs to be created for this one.