On Feb 13, 2006, at 7:09 PM, Guido van Rossum wrote:
> On 2/13/06, Michael Foord <[EMAIL PROTECTED]> wrote:
>> Sorry - I meant for the unicode to bytes case. A default encoding that
>> behaves differently to the current implicit encodes/decodes would be
>> confusing IMHO.
>
> And I am in agreement with you there (I think only Phillip argued
> otherwise).
>
>> I agree that string to bytes shouldn't change the value of the bytes.
>
> It's a deal then.
>
> Can the owner of PEP 332 update the PEP to record these decisions?

So, in Python 2.x, you have:

- bytes("\x80"): you get a bytestring with a single byte of value 0x80.
  When no encoding is specified and the object is a str, it doesn't try
  to encode it at all.
- bytes("\x80", encoding="latin-1"): you get an error, because encoding
  the str "\x80" into latin-1 implicitly decodes it into a unicode
  object first, via the system-wide default encoding, ascii.
- bytes(u"\x80"): you get an error, because the default encoding for a
  unicode string is ascii, which cannot represent U+0080.
- bytes(u"\x80", encoding="latin-1"): you get a bytestring with a single
  byte of value 0x80.
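
To make the four cases above concrete, here is a rough model of those rules. It is only a sketch: bytes_2x and DEFAULT_ENCODING are names I made up, it is written in Python 3 syntax so it can be run today, and a Python 3 bytes value stands in for the 2.x str while a Python 3 str stands in for 2.x unicode.

```python
# Hypothetical model of the 2.x rules described above; not a real API.
# Stand-ins: Python 3 `bytes` plays the 2.x str, Python 3 `str` plays 2.x unicode.
DEFAULT_ENCODING = "ascii"  # the system-wide default discussed in the thread


def bytes_2x(obj, encoding=None):
    if isinstance(obj, bytes):
        if encoding is None:
            # str with no encoding: the bytes pass through untouched.
            return obj
        # str with an encoding: implicit decode via the default, then encode.
        return obj.decode(DEFAULT_ENCODING).encode(encoding)
    # unicode: encode with the given encoding, else with the default.
    return obj.encode(encoding or DEFAULT_ENCODING)


untouched = bytes_2x(b"\x80")                     # case 1: passes through
explicit = bytes_2x("\x80", encoding="latin-1")   # case 4: one byte, 0x80
```

In this model, cases 2 and 3 (bytes_2x(b"\x80", encoding="latin-1") and bytes_2x("\x80")) both raise a UnicodeError subclass, the first while decoding and the second while encoding.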

In py3k, when the str object is eliminated, what do you have? Perhaps:

- bytes("\x80"): you get an error, because an encoding is required.
  There is no such thing as a "default encoding" anymore, as there's no
  str object and "\x80" is now a unicode string.
- bytes("\x80", encoding="latin-1"): you get a bytestring with a single
  byte of value 0x80.
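
The py3k rule suggested above could be sketched like this. Again, this is just an illustration of the proposal, not a settled design: proposed_bytes is a hypothetical helper, and the choice of TypeError for the missing-encoding case is my assumption.

```python
# Hypothetical helper modelling the proposed py3k rule; not a real API.
def proposed_bytes(text, encoding=None):
    if encoding is None:
        # No default encoding exists, so text cannot be converted implicitly.
        # (TypeError is an assumed choice of exception.)
        raise TypeError("an encoding is required to convert text to bytes")
    return text.encode(encoding)


one_byte = proposed_bytes("\x80", encoding="latin-1")  # a single byte, 0x80
```

Calling proposed_bytes("\x80") with no encoding raises the error, which is the whole point: the caller always has to say which encoding they mean.
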
James