Siemen

Stef should have added that XPath depends on using Monty's XMLParser suite. I 
tried your snippet on XMLDOMParser, and it parses correctly. I always use 
XMLHTMLParser for parsing HTML, because I can always see the exact relationship 
between the parsed structure and the original HTML. With Soup I often found the 
match difficult or even impossible.

HTH

Peter Kenny

-----Original Message-----
From: Pharo-users [mailto:pharo-users-boun...@lists.pharo.org] On Behalf Of 
Stephane Ducasse
Sent: 08 November 2017 21:19
To: Any question about pharo is welcome <pharo-users@lists.pharo.org>
Subject: Re: [Pharo-users] Soup bug(fix)

Hi Siemen

let me know your login and I can add you as a committer. Paul is also taking 
care of Soup.
Now I like XPath for scraping. Did you see the tutorial I wrote with Peter?


STef

On Wed, Nov 8, 2017 at 2:17 PM, Siemen Baader <siemenbaa...@gmail.com> wrote:
> Hi all,
>
> who maintains Soup, the HTML parser? Stef?
>
> It seems to auto-close <button> (and <a>) tags when nested inside 
> another element. I wrote this test that fails:
>
> testNestedButton
>     "this works with nested <div> tags instead of <button> and when 
> there is no enclosing <div> at all. but here <button> is auto-closed."
>
>     "a does not work either"
>
>     | soup |
>     soup := Soup
>         fromString:
>             '<div><button>
>         <span>text</span>
>    </button>
> </div>'.
>     self assert: soup div button span string equals: 'text'
>
> ----
>
>
> Where should I look to prevent Soup from auto-closing the tag, and 
> where & how should I submit my fix?
>
> cheers,
> Siemen

