James Abbatiello <abb...@gmail.com> added the comment:

In what case(s) do you propose that the output be encoded in UTF-8? If output is to a terminal and that terminal is set to Latin-1 or cp437 or whatever, then outputting UTF-8 will only show garbage characters to the user.
If output is to a file, then using the encoding of the input file makes the most sense to me. Assume you have a simple program encoded in Latin-1 that prints out a string with some non-ASCII characters, and the patch is printed in UTF-8 and redirected to a file. The patch program has no idea what encodings are used; it will just compare the bytes in the original to the bytes in the patch file. These won't match since the encodings differ, and the patch will fail.

If the output is to a pipe then I'm not sure what the right thing is. It may be intended for display on the screen with something like `less`, or it may not. I don't think there's a good solution for this.

So, following the above logic, the patch attached here does the following (a rough sketch of the fallback order appears below):

1) If output is to a terminal (sys.stdout.encoding is set), use that encoding for the output.
2) Otherwise, if an encoding was determined for the input file, use that encoding for the output.
3) If all else fails, use 'ascii'. If the input contained non-ASCII characters and no encoding has been determined for the input, this will cause an exception to be raised.

I think case 3 can only happen when reading the input file from stdin. Perhaps that case needs to be looked at for how to detect the encoding of stdin.
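For concreteness, here is a minimal sketch of that fallback order. This is only an illustration, not the attached patch itself; the helper name choose_output_encoding() and the use of tokenize.detect_encoding() to recover the input file's declared encoding are my own assumptions.

    import sys
    import tokenize

    def choose_output_encoding(input_path=None):
        """Pick an output encoding using the fallback order described above
        (sketch only; the real patch may do this differently)."""
        # 1) Use sys.stdout.encoding when it is set -- the case treated
        #    above as "output is to a terminal".
        if getattr(sys.stdout, "encoding", None):
            return sys.stdout.encoding

        # 2) Otherwise fall back to whatever encoding was detected for the
        #    input file, e.g. from its coding cookie or BOM.
        if input_path is not None:
            with open(input_path, "rb") as f:
                encoding, _ = tokenize.detect_encoding(f.readline)
            return encoding

        # 3) Last resort: 'ascii'.  Writing non-ASCII output will then raise
        #    a UnicodeEncodeError, matching the behaviour described above.
        return "ascii"

In this sketch, step 3 is only reached when no input path is available (e.g. the input came from stdin), which is the case flagged above as needing further thought.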