In article <mailman.3531.1345416176.4697.python-l...@python.org>,
 Chris Angelico <ros...@gmail.com> wrote:

> Really, the only viable alternative to PEP 393 is a fixed 32-bit
> representation - it's the only way that's guaranteed to provide
> equivalent semantics. The new storage format is guaranteed to take no
> more memory than that, and provide equivalent functionality.

In the primordial days of computing, using 8 bits to store a character 
was a profligate waste of memory.  What on earth did people need with 
TWO cases of the alphabet (not to mention all sorts of weird 
punctuation)?  Eventually, memory became cheap enough that the 
convenience of using one character per byte (not to mention 8-bit bytes) 
outweighed the costs.  And crazy things like sixbit and rad-50 got swept 
into the dustbin of history.

So it may be with utf-8 someday.

Clearly, the world has moved to a character set that needs 32-bit
storage units (Unicode code points run up to U+10FFFF).  Not all parts
of the world know that yet, or are willing to admit it, but that doesn't
negate the fact that it's true.  Equally clearly, the concept of one
character per fixed-size unit, as with one character per byte, is a big
win.  The obvious conclusion is that eventually, when memory gets cheap
enough, we'll all be doing utf-32 and all this transcoding nonsense will
look as antiquated as rad-50 does today.
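
For anyone who wants to put rough numbers on that memory trade-off,
here is a minimal sketch.  The sample strings and the helper name
encoded_sizes are made up for illustration, and it only measures the
encoded byte counts, not how any particular Python build stores str
objects internally.

```python
# Illustrative only: compare encoded sizes of a few sample strings.
# The samples and the helper name are assumptions, not from the thread.
samples = {
    "ascii": "hello world",
    "latin": "caf\u00e9",            # 'e' with acute accent, U+00E9
    "cjk": "\u65e5\u672c\u8a9e",     # three CJK code points
    "emoji": "\U0001F600",           # one code point outside the BMP
}


def encoded_sizes(text):
    """Return (utf8_bytes, utf32_bytes) for text, with no BOM."""
    # "utf-32-le" is used instead of "utf-32" so no 4-byte BOM is added.
    return len(text.encode("utf-8")), len(text.encode("utf-32-le"))


for name, text in samples.items():
    utf8, utf32 = encoded_sizes(text)
    print(f"{name}: {len(text)} code points, "
          f"utf-8={utf8} bytes, utf-32={utf32} bytes")
```

utf-32 always costs exactly 4 bytes per code point, so it never comes
out smaller than utf-8; what it buys is a fixed-width unit you can
index directly.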
-- 
http://mail.python.org/mailman/listinfo/python-list
