New submission from rednaks <salexandre...@gmail.com>:

Hello! While parsing some HTML code I got a decode error,
but this issue can be fixed by replacing the last string with s.decode(), as in the attached diff file.
I also tried running my script under Python 3.2, and it does not parse anything.

  File "/usr/lib/python2.7/HTMLParser.py", line 111, in feed
    self.goahead(0)
  File "/usr/lib/python2.7/HTMLParser.py", line 155, in goahead
    k = self.parse_starttag(i)
  File "/usr/lib/python2.7/HTMLParser.py", line 260, in parse_starttag
    attrvalue = self.unescape(attrvalue)
  File "/usr/lib/python2.7/HTMLParser.py", line 410, in unescape
    return re.sub(r"&(#?[xX]?(?:[0-9a-fA-F]+|\w{1,8}));", replaceEntities, s)
  File "/usr/lib/python2.7/re.py", line 151, in sub
    return _compile(pattern, flags).sub(repl, string, count)
UnicodeDecodeError: 'ascii' codec can't decode byte 0x97 in position 1: ordinal not in range(128)

----------
components: Library (Lib)
files: patch.txt
messages: 155366
nosy: rednaks
priority: normal
severity: normal
status: open
title: [PATCH]HTMLParser decode issue
type: crash
versions: Python 2.7, Python 3.2

Added file: http://bugs.python.org/file24780/patch.txt

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue14251>
_______________________________________
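
For anyone hitting the same traceback: one workaround that avoids mixing byte strings and unicode is to decode the input to text before handing it to the parser, instead of calling s.decode() inside unescape(). The following is only a minimal sketch under assumptions. It targets Python 3, where the module is html.parser (on 2.7 the module is HTMLParser and you would feed a unicode object). The names AttrCollector and parse_html_bytes, and the UTF-8 default, are hypothetical and are not taken from the report or from patch.txt; the right encoding depends on the page being parsed.

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Hypothetical helper: collect (tag, attribute, value) triples."""

    def __init__(self):
        super().__init__()
        self.attrs = []

    def handle_starttag(self, tag, attrs):
        # Attribute values arrive with character references already unescaped.
        for name, value in attrs:
            self.attrs.append((tag, name, value))


def parse_html_bytes(raw, encoding="utf-8"):
    """Decode raw bytes to text first, then feed the text to the parser.

    The encoding is an assumption; use the one declared by the document.
    """
    text = raw.decode(encoding, errors="replace")
    parser = AttrCollector()
    parser.feed(text)
    parser.close()
    return parser.attrs


# Non-ASCII bytes plus an entity inside an attribute value.
raw = '<a title="caf\u00e9 &amp; bar">link</a>'.encode("utf-8")
result = parse_html_bytes(raw)
print(result)
```

Decoding once at the boundary keeps every string inside the parser as text, so the entity replacement in the attribute value never has to combine bytes with unicode.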