On Jun 27, 12:20 am, Andy <[EMAIL PROTECTED]> wrote:
> Hi guys,
>
> I'm writing a piece of software for some Thai friend. At the end it
> is supposed to print on paper some report with tables of text and
> numbers. When I test it in English, the columns are aligned nicely,
> but when he tests it with Thai data, the columns are all crooked.
>
> The problem here is that in the Thai writing system some times two or
> more characters together might take one single space, for example งิ
> (u"\u0E07\u0E34"). This is why when I use something like u"%10s"
> % ..., it just doesn't work as expected.
>
> Is anybody aware of an alternative string format function that can
> deal with this kind of writing properly?
In the general case it is impossible to write such a function for
arbitrary Unicode characters without feedback from the rendering
library. Assuming you use a *fixed-width* font for both English and
Thai, the following function returns how many columns your text will
occupy:

    from unicodedata import category

    def columns(s):
        # Nonspacing marks (category 'Mn') are drawn above or below the
        # preceding character and take no column of their own.
        return sum(1 for c in s if category(c) != 'Mn')

-- Leo
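
A minimal sketch of how the column count could replace u"%10s" padding.
It repeats the columns function from the reply so that it is
self-contained. The helper names ljust_columns and rjust_columns are
hypothetical, not part of any library. It assumes a fixed-width font and
only accounts for nonspacing marks; double-width characters (for
example, CJK) are not handled.

```python
from unicodedata import category


def columns(s):
    """Count display columns, skipping nonspacing marks (category 'Mn')."""
    return sum(1 for c in s if category(c) != 'Mn')


def ljust_columns(s, width, fill=u" "):
    """Hypothetical helper: pad on the right up to `width` columns."""
    return s + fill * max(0, width - columns(s))


def rjust_columns(s, width, fill=u" "):
    """Hypothetical helper: pad on the left, like u"%10s" but by columns."""
    return fill * max(0, width - columns(s)) + s


if __name__ == "__main__":
    # Each row should end at the same column in a fixed-width font.
    for word in (u"abc", u"\u0E07\u0E34"):
        print(rjust_columns(word, 10) + u"|")
```

Padding by column count rather than by len() means the padded string
may contain more than `width` code points, which is the intended effect
here.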