Ezio Melotti <ezio.melo...@gmail.com> added the comment:

I tested this again, and indeed a bare s.decode() is not enough to fix the problem. The attribute might contain non-ASCII characters, and that will result in an error (see for example the "test.py" script attached to #3932). The correct solution is to decode the page before passing it to the parser.
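As a rough illustration of "decode the page before passing it to the parser", here is a minimal sketch. It uses Python 3's html.parser rather than the Python 2 HTMLParser module this report concerns, and the AttrCollector class, the sample markup, and the choice of UTF-8 are all assumptions made for the example; a real page's encoding would have to come from its headers or meta tags.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Hypothetical helper that records every (tag, attrs) pair it sees."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        self.seen.append((tag, attrs))


# Raw bytes as they might arrive from a file or socket.
# The UTF-8 encoding is an assumption for this example.
raw_page = '<a title="caf\u00e9 &amp; bar" href="/x">link</a>'.encode("utf-8")

# Decode the whole page once, up front, so the parser only ever sees text.
page = raw_page.decode("utf-8")

parser = AttrCollector()
parser.feed(page)
parser.close()

print(parser.seen)
```

Decoding once at the boundary keeps every attribute value as text, so non-ASCII characters and entities such as &amp; in the same attribute are handled together instead of being mixed with undecoded bytes.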

----------
resolution:  -> duplicate
stage:  -> committed/rejected
status: open -> closed
superseder:  -> HTMLParser cannot handle '&' and non-ascii characters in attribute names
versions:  -Python 3.2

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue14251>
_______________________________________