
Import data from html page issue

j_bonnouvrier

Community Trekker

Joined:

Dec 19, 2012

Hi,

 

I am trying to import the data from the following website:

http://www.infoclimat.fr/observations-meteo/archives/24/janvier/2017/grenoble-st-geoirs/07486.html

It is basically a record of meteorological data.

 

I am using the JSL command:

Open("http://www.infoclimat.fr/observations-meteo/archives/24/janvier/2017/grenoble-st-geoirs/07486.html");

But it fails. Does anyone have a solution?

Thanks and regards,

 

Jérôme

 

14 REPLIES
txnelson

Super User

Joined:

Jun 22, 2012

I attempted to interactively open the webpage using

     File==>Internet Open

and got the following error message

     This web page uses an unsupported character set: "iso-8859-15".

I would contact JMP directly to ask when support for this character set might be available.

Jim
M_Anderson

Staff

Joined:

Nov 21, 2014

Have you tried File > Internet Open...? In the dialog, you can choose to open the HTML as data. You should get two data tables. In the table you want, look at the "Source" script in the table script area to see the JSL that opens it directly.

 

Best,

 

M

M_Anderson

Staff

Joined:

Nov 21, 2014

@txnelson, that's odd.  I just did it through the GUI and it worked fine.

JSL for the first table:

Open(
	"http://www.infoclimat.fr/observations-meteo/archives/24/janvier/2017/grenoble-st-geoirs/07486.html",
	HTML Table( 1, Column Names( 0 ), Data Starts( 1 ) )
)

JSL for the second table:

Open(
	"http://www.infoclimat.fr/observations-meteo/archives/24/janvier/2017/grenoble-st-geoirs/07486.html",
	HTML Table( 2, Column Names( 1 ), Data Starts( 2 ) )
)

 

There do appear to be a lot of pictures in the data table.

txnelson

Super User

Joined:

Jun 22, 2012

Here is the log I get when I run your little piece of JSL to get to the first table

TEConverter - unrecognized charset in access or evaluation of 'Open' , Open/*###*/(
"http://www.infoclimat.fr/observations-meteo/archives/24/janvier/2017/grenoble-st-geoirs/07486.html",
HTML Table( 1, Column Names( 0 ), Data Starts( 1 ) ))

In the following script, error marked by /*###*/
Open/*###*/(
"http://www.infoclimat.fr/observations-meteo/archives/24/janvier/2017/grenoble-st-geoirs/07486.html",
HTML Table( 1, Column Names( 0 ), Data Starts( 1 ) ))
Jim
M_Anderson

Staff

Joined:

Nov 21, 2014

@txnelson - it appears to be a Mac vs. Windows issue. It runs fine on my macOS box, but I get the same error you do on my Windows partition.

 

@j_bonnouvrier - you probably want to report this to tech support, like @txnelson indicated originally. 

j_bonnouvrier

Community Trekker

Joined:

Dec 19, 2012

Thanks, that's what I will do!

 

Craige_Hales

Staff

Joined:

Mar 21, 2013

It looks like JMP on Windows does not handle ISO/IEC 8859-15 very well. Here's a workaround if you need it (the sockets work only for http, not for https). It is mostly taken from the rewritten scripting example in JMP 13.1. At the end, the page is imported as iso-8859-1, which appears to be close enough; the Wikipedia page shows eight characters that might be problems.

// <<connect is used to connect to a remote computer with an open listening socket.
// some web sites require www, some don't like it.  Some require the HTTP/1.1 format, some are happy with HTTP/1.0 in the GET.
// You'll want to make the error handling more robust...
skt = Socket();
rc = skt << connect( "www.infoclimat.fr", "80" );
If( rc[2] != "ok",
	Show( rc );
	Stop();
); 

// request a resource from the remote computer.  the trailing slash in the example might or might not be needed.
skt << Send(
	Char To Blob(
		"GET /observations-meteo/archives/24/janvier/2017/grenoble-st-geoirs/07486.html HTTP/1.1~0d~0aHost: www.infoclimat.fr~0d~0aConnection: Close~0d~0a~0d~0a",
		"ascii~hex"
	)
);
// gather the response, it might be more than one buffer
blob = Char To Blob( "" ); // start with nothing
blobtext = "";
timeout = 50; // give remote a short time to send a response.  you might need to tune the timeout behavior.
While( timeout-- > 0,
	rc = skt << recv( 10000 );
	If(
		rc[2] == "ok",
			blob = blob || rc[3]; // recv always returns a blob, not text
			Write( "\!nsome text received" ); // typically about three of these, ymmv
			timeout = 50; // reset, still receiving ok
	, // else
		Starts With( rc[2], "CLOSED" ),
			Break(); // done
	, // else
		Starts With( rc[2], "WOULDBLOCK" ), //
			Write( "\!nwaiting..." ) // still fetching, maybe
	, // else...what?
		Show( rc );
		Stop(); // that was unexpected
	);
	blobtext = Blob To Char( blob, encoding = "utf-8" );
	If( timeout == 0,
		Write( "\!nTimed out waiting for remote request" );
		Stop();
	);
	Wait( .05 ); // give the OS some cycles to work with the incoming data
);

Show( Length( blob ) );

// JMP does not understand what the page claims: iso-8859-15
txt = Blob To Char( blob, encoding = "iso-8859-1" );
Length( txt );

f = Save Text File( "$temp/deleteme.html", txt );

Open( f, HTML Table( 2 ) );
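
If any of those eight characters matter for your data, one possible patch is to map them back after decoding. This is only a sketch, not something taken from the JMP example: it assumes the text was decoded as iso-8859-1 as in the script above, and it uses a short made-up string in place of the real page text so it can run on its own. The name fixed is just for illustration.

```jsl
// Stand-in for the page text decoded as iso-8859-1 (hypothetical sample, not from the page).
txt = "Prix : 12 ¤, c½ur";

// ISO-8859-15 reassigns eight code points of ISO-8859-1.
// Replace each ISO-8859-1 character with the one ISO-8859-15 means by the same byte.
fixed = Substitute( txt,
	"¤", "€", // 0xA4
	"¦", "Š", // 0xA6
	"¨", "š", // 0xA8
	"´", "Ž", // 0xB4
	"¸", "ž", // 0xB8
	"¼", "Œ", // 0xBC
	"½", "œ", // 0xBD
	"¾", "Ÿ"  // 0xBE
);
Show( fixed );
```

In the script above, this substitution would go between the Blob To Char call and the Save Text File call. It changes every occurrence of those characters, which is correct only if the page really is iso-8859-15 as it claims.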


 

Craige
txnelson

Super User

Joined:

Jun 22, 2012

Great response, Craige.

 

A document needs to be created for this one.

Jim