I need to parse an HTML page to extract some information, but I am getting unexpected results. See the script below.
HTMLPageAsText = "
<html>
<head>
<title>String 1 i want to get</title>
</head>
<body>String 2 i want to get
<h2>Possibly also this</h2>
<h2>why i get only this</h2>
</body>
</html>
";
PageTitle="";
PageBody="";
Parse XML( HTMLPageAsText,
	On Element( "title",
		End Tag( PageTitle = XML Text(); Show( "Found title" ) )
	),
	On Element( "body",
		End Tag( PageBody = XML Text(); Show( "Found body" ) )
	)
);
show(PageTitle,PageBody);
What I ideally need in the variable PageBody is:
PageBody="String 2 i want to get
<h2>Possibly also this</h2>
<h2>why i get only this</h2>"
or, if it is only possible to get the content of the tag excluding its subtags, I would expect to get:
PageBody="String 2 i want to get"
Instead, what I get is:
PageBody="why i get only this"
What am I doing wrong?