On 11/25/2012 07:48 AM, kobayashi wrote:
The encoding is UTF-8.
By "screen length" I mean this: an ASCII character has length 1, and a
character drawn twice as wide as an ASCII character has length 2.
It is the drawing width, not the number of bytes.
As you say, it depends on the font. I'll consider it carefully.
Don't forget also that there are combining characters. To wit:
>>> "\u00e1"
'á'
>>> "\u0061\u0301"
'á'
(U+00E1 is an 'a' with an acute accent; U+0061 is an unaccented 'a'; U+0301
is a combining acute accent.)
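
To make the difference concrete, here is a small sketch using only the
standard unicodedata module. The variable names are just for illustration.
The two strings draw the same glyph, but they differ in length and compare
unequal until they are normalized to the same form:

```python
import unicodedata

composed = "\u00e1"          # single precomposed code point
decomposed = "\u0061\u0301"  # 'a' followed by a combining acute accent

# Same glyph on screen, but a different number of code points.
composed_len = len(composed)      # 1
decomposed_len = len(decomposed)  # 2

# The raw strings are not equal ...
raw_equal = composed == decomposed

# ... but NFC normalization composes the pair into the single code point.
normalized_equal = unicodedata.normalize("NFC", decomposed) == composed
```

Normalizing first does not solve the width problem in general, since not
every base-plus-accent sequence has a precomposed form.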
So far the discussion has been about single Unicode code points that
appear as a double-wide glyph (I did not know about those!). Depending
on how you want to look at it, combining characters either produce
sequences of code points that render as a single glyph, or are
themselves zero-width code points.
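
If you take the "zero-width code point" view, a rough width estimate might
look like the sketch below. display_width is a hypothetical helper name,
not something from the standard library, and it assumes that East Asian
Wide and Fullwidth characters take two columns and everything else that is
not a combining mark takes one. As noted above, the real answer depends on
the font and the terminal, so treat it as an approximation only:

```python
import unicodedata

def display_width(text):
    """Estimate how many terminal columns `text` occupies (approximation)."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks are drawn on the previous character: 0 columns.
            continue
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            # East Asian Wide / Fullwidth: assumed to take 2 columns.
            width += 2
        else:
            # Everything else is assumed to take 1 column.
            width += 1
    return width
```

With this, both spellings of the accented 'a' come out as width 1, while
each wide CJK character counts as 2. Control characters and the ambiguous
East Asian width class are not treated specially here.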
Evan