John, thanks for your extensive answer.

>> Hi,
>> I am using Python 2.4.3 on Fedora Core4 and "Eric3" Python IDE.
>> Below mentioned code works fine in the Eric3 environment. While trying
>> to start it from the command line, it returns:
>>
>> Traceback (most recent call last):
>>   File "pokus_1.py", line 5, in ?
>>     print str(a)
>> UnicodeEncodeError: 'ascii' codec can't encode character u'\xc1' in
>> position 6: ordinal not in range(128)

JM> So print a works, but print str(a) crashes.

JM> Instead, insert this:
JM>    import sys
JM>    print "default", sys.getdefaultencoding()
JM>    print "stdout", sys.stdout.encoding
JM> and run your script at the command line. It should print:
JM>    default ascii
JM>    stdout x

**** At the command line it prints: ****

default ascii
stdout UTF-8

JM> here, and crash at the later use of str(a).
JM> Step 2: run your script under Eric3. It will print:
JM>    default y
JM>    stdout z

**** In Eric3 it prints: ****

If the
# -*- Encoding: utf_8 -*-
line is set, then:

default utf_8
stdout
unhandled AttributeError, "AsyncFile instance has no attribute 'encoding' "

If the encoding is not set, then it prints:

DeprecationWarning: Non-ASCII character '\xc3' in file
/root/eric/analyza_dat_TPC/pokus_1.py on line 26, but no encoding
declared; see http://www.python.org/peps/pep-0263.html for details
  execfile(sys.argv[0], self.debugMod.__dict__)

default latin-1
stdout
unhandled AttributeError, "AsyncFile instance has no attribute 'encoding' "

JM> and then should work properly. It is probable that x == y == z ==
JM> 'utf-8'
JM> Step 3: see below.

>>
>> ========== 8< =============
>> #!/usr/bin python
>> # -*- Encoding: utf_8 -*-

JM> There is no UTF8-encoded text in this short test script. Is the above
JM> encoding comment merely a carry-over from your real script, or do you
JM> believe it is necessary or useful in this test script?

Generally, I am working with strings like u'DISKOV\xc1 POLE' (I am
getting them from the database). My intention in using
>> # -*- Encoding: utf_8 -*-
was to suppress the DeprecationWarnings when I use utf_8 in the code
(like u'DISKOV\xc1 POLE').

>>
>> a= u'DISKOV\xc1 POLE'
>> print a
>> print str(a)
>> ========== 8< =============
>>
>> Even it looks strange, I have to use str(a) syntax even I know the "a"
>> variable is a string.

JM> Some concepts you need to understand:
JM> (a) "a" is not a string, it is a reference to a string.
JM> (b) It is a reference to a unicode object (an implementation of a
JM> conceptual Unicode string) ...
JM> (c) which must be distinguished from a str object, which represents a
JM> conceptual string of bytes.
JM> (d) str(a) is trying to produce a str object from a unicode object. Not
JM> being told what encoding to use, it uses the default encoding
JM> (typically ascii) and naturally this will crash if there are non-ascii
JM> characters in the unicode object.

>> I am trying to use ChartDirector for Python (charts for Python) and the
>> method "layer.addDataSet()" needs above mentioned syntax otherwise it
>> returns an Error.

JM> Care to tell us which error???

You can see the error description and the author's comments here:
http://tinyurl.com/ezohe

>>
>> layer.addDataSet(data, colour, str(dataName))

I have tried to experiment with the code a bit. This is the simplest
code with which I can demonstrate my problem:

#!/usr/bin/env python
import sys
print "default", sys.getdefaultencoding()
print "stdout", sys.stdout.encoding

a=['P\xc5\x99\xc3\xad','Petr Jake\xc5\xa1']
b="my nice try %s" % ''.join(a).encode("utf-8")
print b

When I run it from the command line I get:

sys:1: DeprecationWarning: Non-ASCII character '\xc3' in file pokus_1.py
on line 26, but no encoding declared; see
http://www.python.org/peps/pep-0263.html for details
default ascii
stdout UTF-8
Traceback (most recent call last):
  File "pokus_1.py", line 8, in ?
    b="my nice try %s" % ''.join(a).encode("utf-8")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc5 in position 1:
ordinal not in range(128)

JM> The method presumably expects a str object (8-bit string). What does
JM> its documentation say? Again, what error message do you get if you feed
JM> it a unicode object with non-ascii characters?

JM> [Step 3] For foo in set(['x', 'y', 'z']):
JM> Change str(dataName) to dataName.encode(foo). Change any debugging
JM> display to use repr(a) instead of str(a). Test it with both Eric3 and
JM> the command line.

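To check that I read Step 3 correctly, here is a minimal sketch of what I plan to try. The three encodings in the list are only my guesses standing in for x, y and z, data_name is the sample value from my test script, and layer.addDataSet() is not called here at all, so this shows only the encode/repr part:

```python
# Sketch of Step 3: encode explicitly instead of calling str().
# Written to run unchanged on Python 2 and Python 3.

data_name = u'DISKOV\xc1 POLE'  # sample value from the test script

# Guesses standing in for x, y and z; nothing above confirms them.
candidates = ['ascii', 'latin-1', 'utf-8']

encoded = {}
for foo in candidates:
    try:
        encoded[foo] = data_name.encode(foo)
    except UnicodeEncodeError:
        # ascii cannot represent u'\xc1', so this branch is expected there
        encoded[foo] = None
    # repr() shows the raw bytes without going through any codec
    print("%-8s %s" % (foo, repr(encoded[foo])))
```

If this is right, only the ascii attempt fails, and the remaining question is which of the other byte forms ChartDirector actually wants.
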
JM> [Aside: it's entirely possible that your problem will go away if you
JM> remove the letter u from the line a= u'DISKOV\xc1 POLE' -- however if
JM> you want to understand what is happening generally, I suggest you don't
JM> do that]

JM> HTH,
JM> John

--
http://mail.python.org/mailman/listinfo/python-list
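
P.S. About the UnicodeDecodeError from my second experiment: if I follow points (c) and (d) correctly, the list already holds byte strings, so calling .encode("utf-8") on them makes Python 2 first decode them with the default ascii codec, and that is where it fails. A small sketch of that guess follows. The names raw, text and round_trip are mine, and I am assuming the bytes are UTF-8:

```python
# Sketch: byte strings must be decoded before they can be encoded.
# Written for Python 2.6+ and Python 3 (it uses b'' literals).

# Same bytes as in the experiment, joined into one byte string.
raw = b''.join([b'P\xc5\x99\xc3\xad', b'Petr Jake\xc5\xa1'])

try:
    raw.decode('ascii')      # what Python 2 does implicitly before .encode()
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False         # byte 0xc5 is outside the ascii range

text = raw.decode('utf-8')          # assumption: the bytes are UTF-8
round_trip = text.encode('utf-8')   # back to the original bytes

print(ascii_ok)
print(repr(text))
print(round_trip == raw)
```

So it seems the fix in that experiment is either to drop the .encode() call on data that is already bytes, or to decode explicitly first.
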