On 25Jan2014 04:37, Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> wrote:
> I have an unexpected display error when dealing with Unicode strings, and
> I cannot understand where the error is occurring. I suspect it's not
> actually a Python issue, but I thought I'd ask here to start.
>
> Using Python 3.3, if I print a unicode string from the command line, it
> displays correctly. I'm using the KDE 3.5 Konsole application, with the
> encoding set to the default (which ought to be UTF-8, I believe, although
> I'm not completely sure).

There are at least 2 layers: the encoding Python is using for transcription
to the terminal, and the decoding the terminal is making of the byte stream
to decide what to display.

The former can be printed with:

    import sys
    print(sys.stdout.encoding)

The latter depends on your desktop settings and KDE settings, I guess. I
would hope the Konsole will decide based on your environment settings.
Running the shell command:

    locale

will print the settings derived from that. Provided your environment matches
that which invoked the Konsole, that should be informative.

But I expect the Konsole is decoding using UTF-8 because so much else works
for you already.

I would point out that you could perhaps debug with something like this:

    python2.7 ..... | od -c

which will print the output bytes. By printing to the terminal, you're
letting the terminal's decoding get in your way. It is fine for seeing
correct/incorrect results, but not so fine for seeing the bytes causing them.

> This displays correctly:
> [steve@ando ~]$ python3.3 -c "print(u'ñøλπйж')"
> ñøλπйж
>
> Likewise for Python 3.2:
> [steve@ando ~]$ python3.2 -c "print('ñøλπйж')"
> ñøλπйж
>
> But using Python 2.7, I get a really bad case of moji-bake:
> [steve@ando ~]$ python2.7 -c "print u'ñøλπйж'"
> ñøλÏйж
>
> However, interactively it works fine:
[...]

Debug by printing sys.stdout.encoding at this point.

I do recall getting different output encodings depending on how Python was
invoked; I forget the pattern, but I also remember writing some ghastly hack
to work around it, which I can't find at the moment...

Also see "man python2.7", in particular the PYTHONIOENCODING environment
variable. That might let you exert more control.

Cheers,
--
Cameron Simpson <c...@zip.com.au>

ASCII n s. [from the Greek] Those people who, at certain times of the year,
have no shadow at noon; such are the inhabitants of the torrid zone.
        - 1837 copy of Johnson's Dictionary
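
P.S. The two layers described above can be separated in a small Python 3 sketch: encode the string to bytes yourself (layer one), dump those bytes in hex much as od would, then decode them with a deliberately wrong codec to imitate a terminal that disagrees (layer two). The names text, utf8_bytes, hexdump and misdecoded are invented for this illustration. Latin-1 is only an assumed example of a mismatched codec; nothing in the thread establishes which decoding actually produced the garbled output.

```python
import sys

# The test string from the thread (ñøλπйж), written with escapes so this
# file itself does not depend on any source encoding.
text = "\u00f1\u00f8\u03bb\u03c0\u0439\u0436"

# Layer 1: the bytes Python would write if stdout were encoded as UTF-8.
utf8_bytes = text.encode("utf-8")


def hexdump(data):
    """Return the bytes as space-separated hex pairs, in the spirit of od."""
    return " ".join("{:02x}".format(b) for b in data)


# Layer 2: what a reader sees if those bytes are decoded with another codec.
# Latin-1 is an assumption chosen for illustration only.
misdecoded = utf8_bytes.decode("latin-1")

if __name__ == "__main__":
    print("stdout encoding:", sys.stdout.encoding)
    print("bytes:", hexdump(utf8_bytes))
    # ascii() keeps the output printable whatever the terminal encoding is.
    print("misdecoded:", ascii(misdecoded))
```

Looking at the hex shows whether Python wrote the right bytes. If it did, the remaining suspect is the terminal's decoding.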