Martin Potthast <martin.potth...@googlemail.com> added the comment:

Agreed. Here's a patch for HTMLParser. That was easy enough.

With regard to tests, there already seems to be one called test_malformatted_charref in test_htmlparser.py. However, that test exercises the whole parser rather than HTMLParser.unescape() alone.

At the same time, HTMLParser.unescape() carries the following comment:

"# Internal -- helper to remove special character quoting"

It appears the syntax check is already done in line 168, but since the unescape function is publicly visible, I'd say it should be capable of handling all kinds of malformed input, despite that comment. Maybe the comment should be removed.

I'm not entirely sure how to write the test properly, since it doesn't fit into the framework provided by test_htmlparser.py, and unfortunately my time is rather short at the moment.

----------
keywords: +patch
Added file: http://bugs.python.org/file20141/HTMLParser.py.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10759>
_______________________________________
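
For the standalone test of unescape() discussed in the comment above, a minimal sketch might look like the following. It is not the patch attached to the issue. It assumes a modern Python 3 and uses html.unescape() as a stand-in, because HTMLParser.unescape() was later deprecated and removed; the class name, the run() helper and the list of malformed inputs are all hypothetical and chosen only for illustration.

```python
import html
import unittest

# Hypothetical malformed character references, chosen for illustration only.
MALFORMED = ["&#;", "&#x;", "&#bla;", "&#xyz;"]


class UnescapeMalformedTest(unittest.TestCase):
    """Exercise the unescape helper directly instead of the whole parser."""

    def test_malformed_charrefs_are_left_alone(self):
        for text in MALFORMED:
            with self.subTest(text=text):
                # Must not raise, and should return the input unchanged.
                self.assertEqual(html.unescape(text), text)

    def test_wellformed_charrefs_still_work(self):
        # Decimal, hexadecimal and named references are still decoded.
        self.assertEqual(html.unescape("&#65;&#x42;&amp;"), "AB&")


def run():
    # Assumed helper: load and run only this test case, return the result.
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(UnescapeMalformedTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling the function directly keeps the test independent of the event-collecting framework in test_htmlparser.py, which is the difficulty the comment raises.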