Re: String formatting for complex writing systems

Leo Kislov Wed, 27 Jun 2007 03:16:11 -0700

On Jun 27, 12:20 am, Andy <[EMAIL PROTECTED]> wrote:
> Hi guys,
>
> I'm writing a piece of software for some Thai friend.  At the end it
> is supposed to print on paper some report with tables of text and
> numbers.  When I test it in English, the columns are aligned nicely,
> but when he tests it with Thai data, the columns are all crooked.
>
> The problem here is that in the Thai writing system sometimes two or
> more characters together might take one single space, for example งิ
> (u"\u0E07\u0E34").  This is why when I use something like u"%10s"
> % ..., it just doesn't work as expected.
>
> Is anybody aware of an alternative string format function that can
> deal with this kind of writing properly?


In the general case it's impossible to write such a function for many
Unicode characters without feedback from the rendering library.
Assuming you use a *fixed-width* font for both English and Thai, the
following function will return how many columns your text will use:

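from unicodedata import category

def columns(s):
    # Nonspacing marks (category 'Mn') are drawn on the previous character,
    # so they do not occupy a column of their own.
    return sum(1 for c in s if category(c) != 'Mn')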
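
To actually line up a column with this, you could pad by the column
count instead of by len(). The following is only a rough sketch under
the same fixed-width font assumption: pad() is a made-up helper name,
not a library function, and the code is written for Python 3, where
str is already Unicode.

```python
from unicodedata import category

def columns(s):
    # Count every character except nonspacing marks (category 'Mn').
    return sum(1 for c in s if category(c) != 'Mn')

def pad(s, width):
    # Hypothetical helper: left-justify s to `width` display columns
    # by appending spaces; strings wider than `width` are left as-is.
    return s + ' ' * max(0, width - columns(s))

# Both cells should end at the same column in a fixed-width font.
print(pad('\u0e07\u0e34', 10) + '|')
print(pad('abc', 10) + '|')
```

This replaces u"%10s" style padding, which counts code points rather
than columns. It does not handle double-width (e.g. CJK) characters.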

  -- Leo

-- 
http://mail.python.org/mailman/listinfo/python-list
