STINNER Victor <victor.stin...@haypocalc.com> added the comment:

This "feature" was introduced in a big commit from Guido van Rossum (made 
before Python 3.0): r55500. The changelog is strange because it starts with 
"Make test_zipfile pass. The zipfile module now does all I/O in binary mode 
using bytes." but ends with "The _struct needed a patch to support bytes, str8 
and str for the 's' and 'p' formats.". Why was _struct patched at the same time?

Implicit conversion between bytes and str is a very bad idea; it is the root of all 
confusion related to Unicode. The experience with Python 2 demonstrated that it 
should be changed, and it was changed in Python 3.0. But "Python 3.0" is a big 
project with many modules. Some modules were completely broken in Python 3.0, 
they work better in 3.1, and we hope that they will be even better in 3.2.

The attached patch removes the implicit conversion for the 'c', 's' and 'p' 
formats. I made a similar change in ctypes 5 months ago: issue #8966.

If a program written for Python 3.1 fails because of the patch, it can use 
explicit conversion to stay compatible with 3.1 and 3.2 (patched). I think that 
it's better to use explicit conversion.
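
As an illustration of the explicit conversion, here is a minimal sketch. The 
helper name pack_text and the choice of UTF-8 are assumptions made for this 
example; they are not part of the patch.

```python
import struct

def pack_text(fmt, text, encoding="utf-8"):
    # Hypothetical helper: encode explicitly so that struct.pack()
    # only ever receives bytes, never str.
    return struct.pack(fmt, text.encode(encoding))

packed = pack_text("3s", "hé")
```

Because struct.pack() is given bytes, this does not depend on the implicit 
conversion and should behave the same with or without the patch.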

Implicit conversion on 'c' format is really weird and it was not documented 
correctly: the note (1) is attached to "b" format, not to the "c" format. 
Example:

   >>> struct.pack('c', 'é')
   struct.error: char format requires bytes or string of length 1
   >>> len('é')
   1

There is also a length issue with the 's' format: struct.pack() truncates a 
unicode string to a length in bytes, not in characters, which is confusing.

   >>> struct.pack('2s', 'ha')
   b'ha'
   >>> struct.pack('2s', 'hé')
   b'h\xc3'
   >>> struct.pack('3s', 'hé')
   b'h\xc3\xa9'
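
One way to avoid the surprise is to encode first and size the field from the 
encoded bytes. This is only a sketch; UTF-8 is assumed as the encoding and the 
variable names are illustrative.

```python
import struct

text = "hé"
data = text.encode("utf-8")

# The character count and the byte count differ for non-ASCII text.
n_chars = len(text)
n_bytes = len(data)

# Build the format from the byte length so nothing is truncated.
fmt = "%ds" % n_bytes
packed = struct.pack(fmt, data)
```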

Finally, I don't like implicit conversion from unicode to bytes on pack, 
because it's not symmetrical.

   >>> struct.pack('3s', 'hé')
   b'h\xc3\xa9'
   >>> struct.unpack('3s', b'h\xc3\xa9')
   (b'h\xc3\xa9',)

(str -> pack() -> unpack() -> bytes)
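
With explicit conversion on both sides the round trip becomes symmetrical. A 
minimal sketch, again assuming UTF-8 as the encoding:

```python
import struct

original = "hé"

# str -> bytes is done by the caller, not by struct.
packed = struct.pack("3s", original.encode("utf-8"))

# unpack() returns bytes; decoding them restores the original str.
(raw,) = struct.unpack("3s", packed)
restored = raw.decode("utf-8")
```

(str -> encode() -> pack() -> unpack() -> decode() -> str)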

----------
keywords: +patch
nosy: +haypo
Added file: http://bugs.python.org/file20175/struct.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10783>
_______________________________________