[issue10035] sgmllib fail to parse html containing

2010-10-06 Thread Georg Brandl
Georg Brandl added the comment: The browser needs to be very liberal in what it accepts, since nobody wants their page view to break because of such a technicality. This is different for a tool like SGMLParser. In light of this, and because sgmllib is removed anyway in Python 3, I'm closing

[issue10035] sgmllib fail to parse html containing

2010-10-06 Thread halfjuice
halfjuice added the comment: Sorry, the URL on the page is sort of broken. The URL contains the "" stuff. I think you're right, the

[issue10035] sgmllib fail to parse html containing

2010-10-05 Thread Georg Brandl
Georg Brandl added the comment: Is that URL really what you wanted to show me? Also, I'm not intimate with all of SGML's syntax, but ISTM that what you show here is invalid SGML, and as such SGMLParser is not required to parse it.

[issue10035] sgmllib fail to parse html containing

2010-10-05 Thread halfjuice
halfjuice added the comment: well,

[issue10035] sgmllib fail to parse html containing

2010-10-05 Thread Georg Brandl
Georg Brandl added the comment: Are you sure you got the comment syntax right? e.g. SGMLParser should handle that. -- nosy: +georg.brandl resolution: -> works for me status: open -> pending

[issue10035] sgmllib fail to parse html containing

2010-10-05 Thread halfjuice
New submission from halfjuice: When parsing html containing the following tag: ... ... SGMLParser will stop parsing the following content without any warning. When such a tag is removed, everything works fine. Looking into sgmllib.py, I found the statement below: if rawdata.startswith(""). I thin
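
As noted in the thread, sgmllib is removed in Python 3. For anyone checking whether content after a comment still gets parsed there, the following is a minimal sketch using the standard library's html.parser module instead. The class name Collector, the helper collect(), and the sample markup are illustrative assumptions; the exact tag from the original report is not preserved in this archive, so a well-formed comment is used here.

```python
from html.parser import HTMLParser


class Collector(HTMLParser):
    """Hypothetical helper: records text data and comments as they are parsed."""

    def __init__(self):
        super().__init__()
        self.data = []
        self.comments = []

    def handle_data(self, data):
        # Called for text between tags.
        self.data.append(data)

    def handle_comment(self, data):
        # Called with the text between "<!--" and "-->".
        self.comments.append(data)


def collect(html):
    """Parse html and return (text chunks, comments) in document order."""
    parser = Collector()
    parser.feed(html)
    parser.close()  # flush anything still buffered
    return parser.data, parser.comments


if __name__ == "__main__":
    # Sample input is an assumption, not the markup from the report.
    data, comments = collect("<p>before</p><!-- a comment --><p>after</p>")
    print(data)
    print(comments)
```

If the text after the comment shows up in the collected data, the parser did not stop at the comment. Swapping the sample string for the markup that actually triggers the problem is the quickest way to see whether the behaviour carries over.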