Marc-Andre Lemburg <m...@egenix.com> added the comment:

On 2008-12-27 13:58, STINNER Victor wrote:
> Python 2.x allows to encode any byte string (str) and ASCII unicode
> string (unicode):
>
> $ python
> Python 2.5.1 (r251:54863, Jul 31 2008, 23:17:40)
> >>> import zlib
> >>> zlib.compress('abc')
> "x\x9cKLJ\x06\x00\x02M\x01'"
> >>> zlib.compress(u'abc')
> "x\x9cKLJ\x06\x00\x02M\x01'"
> >>> zlib.compress(u'abc\xe9')
> ...
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' ...
>
> I'm not sure that this behaviour was really wanted because the
> decompress operation is not symmetric (the result type is always byte
> string):
>
> $ python
> Python 2.5.1 (r251:54863, Jul 31 2008, 23:17:40)
> >>> import zlib
> >>> zlib.decompress("x\x9cKLJ\x06\x00\x02M\x01'")
> 'abc'
>
I don't see a problem with this. The fact that Python 2.x also accepts
Unicode ASCII strings where strings are normally expected is intended
to help with the migration to Unicode, so the above is expected.

zlib itself doesn't care about whether the data to be encoded is text
or bytes. In Python 3.x, it's probably better to use bytes throughout
the API.

----------
nosy: +lemburg

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue4757>
_______________________________________
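
For illustration only, here is a minimal sketch of what a bytes-only zlib API looks like from the caller's side in Python 3.x. The sample text and the choice of UTF-8 as the encoding are assumptions made for this example, not something stated in the thread; the point is just that text is encoded explicitly before compression, so compress and decompress are symmetric in type.

```python
import zlib

# Sample text with a non-ASCII character (assumed for illustration).
text = "abc\xe9"

# Encode explicitly; UTF-8 is an assumption, any codec agreed on by
# both sides would do.
data = text.encode("utf-8")

compressed = zlib.compress(data)
restored = zlib.decompress(compressed)

# Bytes in, bytes out: the round trip is symmetric.
assert restored == data
assert restored.decode("utf-8") == text

# Passing str directly is rejected in Python 3 instead of being
# implicitly encoded as ASCII.
try:
    zlib.compress(text)
except TypeError:
    rejected = True
else:
    rejected = False
```

With this arrangement the caller, not zlib, decides how text maps to bytes, which avoids the implicit ASCII encoding step shown in the Python 2.5 session above.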