New submission from Ezio Melotti <ezio.melo...@gmail.com>:

The attached patch fixes a few problems with HTMLParser on 2.7. Instead of raising an error when invalid markup is detected, the parser now consumes the invalid input and continues parsing. This patch is a partial backport of #1486713.
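As a rough illustration of the intended behavior (not part of the attached patch), the sketch below feeds a start tag containing a junk character to the parser and records the events it reports. The `EventCollector` class and `collect_events` function are hypothetical helpers written for this example. It assumes a tolerant parser, that is, 2.7 with the patch applied or Python 3's `html.parser`; an unpatched 2.7 may raise `HTMLParseError` instead.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser  # Python 2.7, the version this issue targets


class EventCollector(HTMLParser):
    """Hypothetical helper that records the events the parser reports."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("data", data))


def collect_events(markup):
    """Parse markup and return the list of recorded events."""
    parser = EventCollector()
    parser.feed(markup)
    parser.close()
    return parser.events


# A stray "$" inside the <a> start tag is invalid markup. A tolerant
# parser should skip past it and still report the paragraph that follows.
broken = "<p>before</p><a b='c' $>link</a><p>after</p>"
for event in collect_events(broken):
    print(event)
```

The exact events reported for the broken tag itself can differ between versions; the point is only that parsing continues and the content after the invalid input is still delivered to the handlers.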
After this, two more patches will follow. The first will get rid of the errors raised while parsing declarations and should also solve #13576:

     def unknown_decl(self, data):
    -    self.error("unknown declaration: %r" % (data,))
    +    pass

The second will take care of "bogus comments" (see #13960). Once this is done, HTMLParser should be able to parse (almost) everything. I'm planning to commit this before the release of 2.7.3.

----------
assignee: ezio.melotti
components: Library (Lib)
files: issue13987.diff
keywords: patch
messages: 153043
nosy: benjamin.peterson, eric.araujo, ezio.melotti, r.david.murray
priority: normal
severity: normal
stage: patch review
status: open
title: Handling of broken markup in HTMLParser on 2.7
type: behavior
versions: Python 2.7
Added file: http://bugs.python.org/file24475/issue13987.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue13987>
_______________________________________