Alright, it only happens in the Windows cmd console; in IDLE it works fine.
Any workaround?

Dorian

On 13/04/2010 13:40, Dodo wrote:
Here's a small script to reproduce the error, running Windows 7 with Python 3.1.

FILE : parseShift.py

import urllib.request as url
from html.parser import HTMLParser

class myParser(HTMLParser):
    def handle_starttag(self, tag, attrs):
        print("Start of %s tag : %s" % (tag, attrs))  # raises when attrs hold Japanese text

test = myParser()

handle = url.urlretrieve("http://localhost/shift.html")
handleTemp = open( handle[0] , encoding="Shift-JIS" )  # decode the downloaded file as Shift-JIS
test.feed( handleTemp.read() )
handleTemp.close()

FILE : shift.html (encoded Shift-JIS)

<p class="thisisclass (not_in_japanese) reading_this_should_be_ok">Some random japanese
<p><strong>東方プロジェクト</strong>
<a href="#" title="キャプテン・ムラサ">Link</a>

OUTPUT

Start of p tag : [('class', 'thisisclass (not_in_japanese) reading_this_should_be_ok')]
Start of p tag : []
Start of strong tag : []
Traceback (most recent call last):
  File "D:\Dorian\Python\parseShift.py", line 12, in <module>
    test.feed( handleTemp.read() )
  File "C:\Python31\lib\html\parser.py", line 108, in feed
    self.goahead(0)
  File "C:\Python31\lib\html\parser.py", line 148, in goahead
    k = self.parse_starttag(i)
  File "C:\Python31\lib\html\parser.py", line 268, in parse_starttag
    self.handle_starttag(tag, attrs)
  File "D:\Dorian\Python\parseShift.py", line 6, in handle_starttag
    print("Start of %s tag : %s" % (tag, attrs))
  File "C:\Python31\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 44-52: character maps to <undefined>

Any help?

Dorian
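
One possible workaround, sketched under the assumption that the parsing itself is fine and only print() fails because the cmd console encoding (cp1252 in the traceback above) has no mapping for the Japanese characters: escape whatever the console cannot encode before printing it. The helper names console_safe and safe_print are made up for this example and are not part of the standard library.

```python
import sys

def console_safe(text, encoding=None, errors="backslashreplace"):
    """Return text with characters the target encoding cannot represent
    replaced by backslash escapes, so that printing it cannot raise
    UnicodeEncodeError. (Hypothetical helper, not a stdlib function.)"""
    # Fall back to ASCII if stdout reports no encoding (e.g. when redirected).
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    # Round-trip through the console encoding; unencodable characters
    # become \uXXXX sequences instead of raising.
    return text.encode(encoding, errors).decode(encoding)

def safe_print(*args, **kwargs):
    """print() replacement that escapes what sys.stdout cannot encode."""
    print(*(console_safe(str(arg)) for arg in args), **kwargs)

if __name__ == "__main__":
    attrs = [("href", "#"), ("title", "キャプテン・ムラサ")]
    safe_print("Start of %s tag : %s" % ("a", attrs))
```

Calling safe_print instead of print inside handle_starttag should let the script run to the end on cmd, at the cost of showing escapes rather than the real characters. Passing errors="replace" instead would print question marks.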
-- 
http://mail.python.org/mailman/listinfo/python-list