New submission from halfjuice <halfju...@gmail.com>:

When parsing HTML that contains the following tag:
... <!- ie6 doesn't allow empty div. -> ...
SGMLParser stops parsing the rest of the content without any warning. When the 
tag is removed, everything works fine.

Looking into sgmllib.py, I found the statement below:

    if rawdata.startswith("<!", i):
        # This is some sort of declaration; in "HTML as
        # deployed," this should only be the document type
        # declaration ("<!DOCTYPE html...>").

I think this is why it goes wrong: "<!-" also starts with "<!", so the tag is
handled as a declaration rather than as a comment.
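
The sketch below illustrates the suspected cause and one possible workaround. It is not the sgmllib code: classify_markup is a simplified, hypothetical stand-in for the parser's prefix checks, and fix_malformed_comments is an assumed preprocessing helper that rewrites "<!- ... ->" as a proper comment before the text is fed to the parser.

```python
import re


def classify_markup(rawdata, i):
    """Simplified, hypothetical stand-in for the parser's dispatch at index i."""
    if rawdata.startswith("<!--", i):
        return "comment"
    if rawdata.startswith("<!", i):
        # "<!-" ends up here, so it is treated like "<!DOCTYPE ...>".
        return "declaration"
    return "other"


# Assumed workaround: match "<!- text ->" but leave real "<!-- text -->" alone.
_MALFORMED_COMMENT = re.compile(r"<!-(?!-)(.*?)(?<!-)->", re.DOTALL)


def fix_malformed_comments(html):
    """Rewrite "<!- text ->" as "<!-- text -->" before parsing."""
    return _MALFORMED_COMMENT.sub(r"<!--\1-->", html)


snippet = "<div><!- ie6 doesn't allow empty div. -></div>"
fixed = fix_malformed_comments(snippet)

print(classify_markup(snippet, 5))  # the malformed tag, as in the report
print(fixed)
print(classify_markup(fixed, 5))    # the same position after the rewrite
```

Cleaning the input this way only hides the problem for this one pattern; whether the parser itself should tolerate such tags is a separate question.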

----------
components: Library (Lib)
messages: 118048
nosy: halfjuice
priority: normal
severity: normal
status: open
title: sgmllib fail to parse html containing <!- .... ->
type: behavior
versions: Python 2.6

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10035>
_______________________________________
