Re: unicode, bytes redux

Leif K-Brooks Mon, 25 Sep 2006 01:15:55 -0700

Paul Rubin wrote:
> Duncan Booth explains why that doesn't work.  But I don't see any big
> problem with a byte count function that lets you specify an encoding:
> 
>      u = buf.decode('UTF-8')
>      # ... later ...
>      u.bytes('UTF-8') -> 3
>      u.bytes('UCS-4') -> 4
> 
> That avoids creating a new encoded string in memory, and for some
> encodings, avoids having to scan the unicode string to add up the
> lengths.


It requires a fairly large change to code and API for a relatively 
uncommon problem. How often do you need to know how many bytes an 
encoded Unicode string takes up without needing the encoded string itself?
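
For comparison, here is a rough sketch of both approaches in current Python 3. The helper names encoded_len and utf8_len are made up for illustration; neither is an existing method or a proposed API. The first builds the temporary encoded string, the second scans the code points and adds up UTF-8 lengths without encoding anything.

```python
def encoded_len(u, encoding="utf-8"):
    """Byte length of u in the given encoding, via a temporary encoded copy."""
    return len(u.encode(encoding))


def utf8_len(u):
    """UTF-8 byte length of u, computed by scanning code points.

    No encoded copy is created. Unlike str.encode, this does not
    reject lone surrogates; it simply counts them as 3 bytes.
    """
    total = 0
    for ch in u:
        cp = ord(ch)
        if cp < 0x80:          # ASCII range: 1 byte
            total += 1
        elif cp < 0x800:       # up to U+07FF: 2 bytes
            total += 2
        elif cp < 0x10000:     # rest of the BMP: 3 bytes
            total += 3
        else:                  # supplementary planes: 4 bytes
            total += 4
    return total


u = "\u20ac"  # EURO SIGN, a single code point
print(utf8_len(u))                  # UTF-8 length without encoding
print(encoded_len(u, "utf-8"))      # same figure via a temporary copy
print(encoded_len(u, "utf-32-be"))  # UCS-4 style, no byte-order mark
```

Note that the plain "utf-32" codec prepends a byte-order mark, so the sketch uses "utf-32-be" to get the bare 4 bytes per code point.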
-- 
http://mail.python.org/mailman/listinfo/python-list