Re: Re: How to get a "screen" length of a multibyte string?

Evan Driscoll Sun, 25 Nov 2012 17:02:20 -0800

On 11/25/2012 07:48 AM, kobayashi wrote:

The encoding is UTF-8.
By "screen length" I mean this: an ASCII character counts as 1, and a 
character twice as wide as an ASCII character counts as 2.
It's not the byte count, but the drawing width.
As you say, it depends on the font. I'll consider this carefully.


Don't forget also that there are combining characters. To wit:

>>> "\u00e1"
'á'
>>> "\u0061\u0301"
'á'

(U+00e1 is an 'a' with an acute accent; U+0061 is an unaccented 'a'; U+0301 is a combining acute accent.)
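To make the difference between the two spellings concrete, here is a small sketch using only the standard unicodedata module (Python 3). The variable names are just for illustration. It shows that the two strings have different lengths and compare unequal, even though they render the same, and that NFC normalization folds the decomposed form into the precomposed one.

```python
import unicodedata

precomposed = "\u00e1"        # one code point: 'a' with acute accent
decomposed = "\u0061\u0301"   # two code points: 'a' + combining acute accent

# len() counts code points, not glyphs on the screen.
print(len(precomposed))            # 1
print(len(decomposed))             # 2
print(precomposed == decomposed)   # False

# NFC normalization composes the pair into the single code point.
composed = unicodedata.normalize("NFC", decomposed)
print(composed == precomposed)     # True

# A nonzero canonical combining class marks U+0301 as a combining character.
print(unicodedata.combining("\u0301"))   # 230
```

Normalizing to NFC first only helps when a precomposed code point exists; plenty of base-plus-mark sequences have none, so it is not a complete answer on its own.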

So far the discussion has been on single Unicode code points which appear as a double-wide glyph (I did not know about those!); depending on how you want to look at it, combining characters result in sequences of Unicode code points which result in a single glyph, or combining characters are zero-width code points.
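Putting the two ideas together, a rough sketch of a width function might look like the following. The name screen_width is hypothetical, not something from the standard library. It takes the "zero-width code point" view of combining characters and counts East Asian Wide and Fullwidth characters as 2 columns. It deliberately ignores control characters, zero-width format characters, and the "ambiguous" width class, and the real answer still depends on the font and terminal.

```python
import unicodedata

def screen_width(text):
    """Approximate the number of terminal columns needed to draw text.

    Hypothetical helper: an approximation, not an exact measurement.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining mark: drawn on top of the previous character.
            continue
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            # Wide or Fullwidth: occupies two columns.
            width += 2
        else:
            width += 1
    return width

print(screen_width("abc"))                 # 3
print(screen_width("\u00e1"))              # 1
print(screen_width("\u0061\u0301"))        # 1
print(screen_width("\u65e5\u672c\u8a9e"))  # 6
```

Both spellings of the accented 'a' come out as 1 column, and the three CJK characters in the last line come out as 6.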


Evan

--
http://mail.python.org/mailman/listinfo/python-list
