Marc-Andre Lemburg <[EMAIL PROTECTED]> added the comment:

On 2008-05-06 00:07, Guido van Rossum wrote:
> Guido van Rossum <[EMAIL PROTECTED]> added the comment:
>
> On Fri, Apr 18, 2008 at 1:46 AM, Marc-Andre Lemburg
> <[EMAIL PROTECTED]> wrote:
>> On 2008-04-18 05:35, atsuo ishimoto wrote:
>> > atsuo ishimoto <[EMAIL PROTECTED]> added the comment:
>> >
>> > Is a codec which encode() returns an Unicode allowed in Python3?
>>
>> Sure, why not ?
>
> Actually, it is not. In Py3k, x.encode() always requires x to be a str
> (i.e. unicode) instance and return a bytes instance. y.decode()
> requires y to be a bytes instance and returns a str (i.e. unicode)
> instance.
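
For illustration, a minimal sketch of the rule quoted above, run against a current Python 3 release rather than the 2008 Py3k branch under discussion. The codec names `rot_13` and `zlib_codec` are standard-library codecs. Reaching them through `codecs.encode()` / `codecs.decode()` is how later releases expose non-text conversions, which goes beyond what this thread states and is shown only for comparison.

```python
import codecs

# The str/bytes methods are restricted to text encodings:
# str.encode() yields bytes, bytes.decode() yields str.
text = "caf\u00e9"
data = text.encode("utf-8")
round_tripped = data.decode("utf-8")

# The codecs module functions accept other type combinations,
# e.g. str -> str (rot_13) and bytes -> bytes (zlib_codec).
rotated = codecs.encode("abc", "rot_13")
packed = codecs.encode(b"aaaa", "zlib_codec")
unpacked = codecs.decode(packed, "zlib_codec")

# The method form refuses a codec that is not a text encoding.
try:
    "abc".encode("rot_13")
    method_rejected = False
except (LookupError, TypeError):
    method_rejected = True
```

The last step shows the restriction in practice: the same codec that works through `codecs.encode()` is rejected by `str.encode()`.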

So you've limited the codec design to just doing Unicode<->bytes
conversions ?

The original codec design was to have the codec decide which
types to take on input and to generate on output, e.g. to
escape characters in Unicode (converting Unicode to Unicode),
work on compressed 8-bit strings (converting 8-bit strings to
8-bit strings), etc.

>> I think you have to ask another question: Is repr() allowed to
>> return a string (instead of Unicode) in Py3k ?
>
> In Py3k, "strings" *are* unicode. The str data type is Unicode.

With "strings" I always refer to 8-bit strings, i.e. 8-bit data that
is encoded in some encoding.

> If you're asking about repr() possibly returning a bytes instance,
> definitely not.
>
>> If not, then unicode_repr() will have to check the return value of
>> the codec and convert it back to Unicode as necessary.
>
> What codec?

The idea is to have a codec which takes the Unicode object and converts
it to its repr()-value.

Now, since you apparently cannot go the direct way anymore (i.e.
have the codec encode Unicode to Unicode), you'd have to first
use a codec which converts the Unicode object to its repr()-value
represented as a bytes object and then convert the bytes object back
to Unicode in unicode_repr().

With the original design, this extra step wouldn't have been
necessary.

>> > I started to think codec is not necessary, but python function is enough.
>>
>> That's what we currently have with unicode_repr(), but it doesn't
>> solve the problem.
>
> I'm lost here.

See my previous replies on this ticket.

> PS. Atsuo's PEP has now been checked in as PEP 3138. Discussion should
> start soon on the python-3000 list.

__________________________________
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2630>
__________________________________
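
The two-step route described in the message above (Unicode to bytes through a codec, then bytes back to Unicode) can be sketched roughly as follows. This is an illustration only: `repr_via_bytes` is a hypothetical helper name, and the standard `unicode_escape` codec stands in for the repr codec discussed on the ticket, which it does not reproduce exactly (for instance, it adds no surrounding quotes).

```python
def repr_via_bytes(text):
    """Hypothetical helper: escape non-ASCII characters in two steps."""
    # Step 1: str -> bytes, using unicode_escape as a stand-in codec.
    escaped = text.encode("unicode_escape")
    # Step 2: bytes -> str, the extra conversion the message refers to.
    return escaped.decode("ascii")


result = repr_via_bytes("caf\u00e9 \u3042")
```

With a codec allowed to map Unicode directly to Unicode, the second step would not be needed.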