On Fri, 09 Nov 2012 23:22:04 +1100, Chris Angelico wrote:

> On Fri, Nov 9, 2012 at 10:08 PM, Helmut Jarausch
> <jarau...@igpm.rwth-aachen.de> wrote:
>> For me it's not funny, at all.
>
> His description "funny" was in reference to the fact that you
> described this as a bug. This is a heavily-used mature language; bugs
> as fundamental as you imply are unlikely to exist (consequences of
> design decisions there will be, but not outright bugs, usually);
> extraordinary claims require extraordinary evidence.

Just for the record: I first discovered a real bug in Python3 when using
os.walk on a file system containing non-ASCII characters in file names.
I also encountered very strange behavior (I would still call it a bug) when
trying to put non-ASCII characters in email headers. This has only been
solved satisfactorily in Python3.3.

>> Whenever Python3 encounters a bytestring it needs an encoding to convert
>> it to a string. If I feed a list of bytestrings or a list of lists of
>> bytestrings to 'str', etc., it should use the encoding for each bytestring
>> component of the given data structure.
>>
>> How can I convert a data structure of arbitrarily complex nature, which
>> contains bytestrings somewhere, to a string?
>
> Okay, now we're getting somewhere.
>
> What you really should be doing is not transforming the whole
> structure, but explicitly transforming each part inside it. I
> recommend you stop fighting the language and start thinking about your
> data as either *bytes* or *characters* and using the appropriate data
> types (bytes or str) everywhere. You'll then find that it makes
> perfect sense to explicitly translate (en/decode) from one to another,
> but it doesn't make sense to encode a list in UTF-8 or decode a
> dictionary from Latin-1.
>
>> This problem has arisen while converting a working Python2 script to
>> Python3.3.
>> Since Python2 doesn't have bytestrings it just works.
>
> Actually it does; it just calls them "str". And there's a Unicode
> string type, called "unicode", which is (more or less) the thing that
> Python 3 calls "str".
>
> You may be able to do some kind of recursive cast that, in one sweep
> of your data structure, encodes all str objects into bytes using a
> given encoding (or the reverse thereof). But I don't think this is the
> best way to do things.

Thanks, but in my case the (complex) object is returned via ctypes from the
aspell library.
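
For illustration, the "recursive cast" mentioned above could look roughly
like the sketch below. It assumes the ctypes result has already been copied
into plain Python containers (lists, tuples, dicts, sets) and that every
bytestring in it uses the same encoding. The name decode_nested is
hypothetical; it is not a standard-library function.

```python
def decode_nested(obj, encoding="utf-8", errors="strict"):
    """Return a copy of obj with every bytes object decoded to str.

    Hypothetical helper: handles only bytes, list, tuple, dict, set and
    frozenset; any other object is returned unchanged.
    """
    if isinstance(obj, bytes):
        return obj.decode(encoding, errors)
    if isinstance(obj, dict):
        # Decode keys as well as values.
        return {
            decode_nested(k, encoding, errors): decode_nested(v, encoding, errors)
            for k, v in obj.items()
        }
    if isinstance(obj, list):
        return [decode_nested(x, encoding, errors) for x in obj]
    if isinstance(obj, tuple):
        # Note: a namedtuple would come back as a plain tuple here.
        return tuple(decode_nested(x, encoding, errors) for x in obj)
    if isinstance(obj, (set, frozenset)):
        return type(obj)(decode_nested(x, encoding, errors) for x in obj)
    return obj


# Made-up sample data, encoded as Latin-1.
data = [b"gr\xfc\xdfe", (b"caf\xe9", 3), {b"key": [b"na\xefve"]}]
print(decode_nested(data, encoding="latin-1"))
```

After such a pass, str() on the result no longer shows any b'...' literals,
and the encoding is stated once at the point where the data leaves the
C library.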

I still think that a standard function in Python3 which is able to
'stringify' objects should take an encoding parameter.

Thanks,
Helmut.