Martin Potthast <martin.potth...@googlemail.com> added the comment:

Agreed. Here's a patch for HTMLParser. That was easy enough.

With regard to tests, there already seems to be one called 
test_malformatted_charref in test_htmlparser.py. However, that test exercises 
the whole parser rather than HTMLParser.unescape() alone.

At the same time, HTMLParser.unescape() has the following comment:
"# Internal -- helper to remove special character quoting"

It appears the syntax check is already done in line 168, but since the unescape 
function is publicly visible, I'd say it should be able to handle all kinds of 
malformed input, despite that comment. Maybe the comment should be removed.
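
For illustration only, here is a minimal sketch of what "handling malformed 
input" could mean for an unescape helper. It is not the attached patch and not 
the stdlib implementation: the name tolerant_unescape, the regular expression, 
and the choice to leave unparseable references untouched are all assumptions 
made for this example.

```python
import re
from html.entities import name2codepoint

# Hypothetical stand-alone helper (an assumption for illustration,
# not the code in HTMLParser.py or in the attached diff).
_REFERENCE = re.compile(r"&(#?\w{1,32});")


def tolerant_unescape(text):
    """Replace character and entity references, leaving malformed ones as-is."""

    def replace(match):
        ref = match.group(1)
        try:
            if ref.startswith("#"):
                body = ref[1:]
                if body[:1] in ("x", "X"):
                    # Hexadecimal character reference, e.g. &#x41;
                    return chr(int(body[1:], 16))
                # Decimal character reference, e.g. &#65;
                return chr(int(body))
            # Named entity, e.g. &amp;
            return chr(name2codepoint[ref])
        except (KeyError, ValueError, OverflowError):
            # Unknown name, non-numeric digits, or out-of-range code point:
            # return the original text instead of raising.
            return match.group(0)

    return _REFERENCE.sub(replace, text)


if __name__ == "__main__":
    print(tolerant_unescape("&amp; &#65; &#x41; &#bad; &nosuch;"))
```

The point of the sketch is only that int() and the entity lookup are wrapped in 
a try/except, so a reference such as "&#bad;" passes through unchanged rather 
than raising an exception.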

I'm not entirely sure how to write the test properly, since it doesn't fit into 
the framework provided by test_htmlparser.py; and unfortunately, my time is 
rather short at the moment.

----------
keywords: +patch
Added file: http://bugs.python.org/file20141/HTMLParser.py.diff

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10759>
_______________________________________