In article <mailman.1988.1281579897.1673.python-l...@python.org>, Benjamin Kaplan <benjamin.kap...@case.edu> wrote:
> On Wed, Aug 11, 2010 at 6:21 PM, RG <rnospa...@flownet.com> wrote:
> > I thought it was hard-coded into the Python executable at compile time,
> > but that is apparently not the case:
> >
> > [...@mickey:~]$ python
> > Python 2.6.1 (r261:67515, Feb 11 2010, 00:51:29)
> > [GCC 4.2.1 (Apple Inc. build 5646)] on darwin
> > Type "help", "copyright", "credits" or "license" for more information.
> > >>> import sys;print sys.stdin.encoding
> > UTF-8
> > >>> ^D
> > [...@mickey:~]$ echo 'import sys;print sys.stdin.encoding' | python
> > None
> > [...@mickey:~]$
> >
> > And indeed, trying to pipe unicode into Python doesn't work, even though
> > it works fine when Python runs interactively. So how can I make this
> > work?
> >
>
> Sys.stdin and stdout are files, just like any other. There's nothing
> special about them at compile time. When the interpreter starts, it
> checks to see if they are ttys. If they are, then it tries to figure
> out the terminal's encoding based on the environment. The code for
> this is in pythonrun.c if you want to see exactly what it's doing.

Thanks. Looks like the magic incantation is:

export PYTHONIOENCODING='utf-8'

> By the way, there is no such thing as piping Unicode into Python.

Yeah, I know. I should have said "piping UTF-8 encoded unicode" or
something like that.

> You really have to watch your encodings
> when you pass data around between programs. There's no way to avoid
> it.

Yeah, I keep re-learning that lesson again and again.

rg
--
http://mail.python.org/mailman/listinfo/python-list
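
P.S. For anyone who finds this thread later: besides setting PYTHONIOENCODING, another option is to skip the tty guessing entirely and decode the piped bytes yourself. What follows is only a minimal sketch, assuming the input really is UTF-8. The helper names describe() and text_reader() are made up for illustration and are not part of Python.

```python
import codecs
import io


def describe(stream):
    """Return (is_a_tty, encoding) for a file-like object.

    A pipe or an in-memory buffer is not a tty, which is the case where
    the interpreter in the session above reported an encoding of None.
    """
    is_tty = getattr(stream, "isatty", lambda: False)()
    return is_tty, getattr(stream, "encoding", None)


def text_reader(byte_stream, encoding="utf-8"):
    """Wrap a byte stream so that reads return decoded text.

    The encoding is an assumption supplied by the caller; nothing here
    can detect what the program on the other end of the pipe wrote.
    """
    return codecs.getreader(encoding)(byte_stream)


# Stand-in for a pipe: the UTF-8 bytes for "caf\u00e9" plus a newline.
fake_pipe = io.BytesIO(b"caf\xc3\xa9\n")
is_tty, enc = describe(fake_pipe)       # not a tty, no encoding attribute
decoded = text_reader(fake_pipe).read()
```

With a real pipe you would hand text_reader() the byte stream behind standard input: in Python 3 that is sys.stdin.buffer, while in 2.x sys.stdin already delivers bytes.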