On Tue, 2010-11-23 at 19:21 +0800, Senthil Kumaran wrote:
> On Tue, Nov 23, 2010 at 7:05 PM, Kenneth Gonsalves <law...@au-kbc.org>
> wrote:
> > say I have an indic (tamil) string like நான். This is actually
> > represented by the following:
> > 0x0ba8,0x0bbe,0x0ba9,0x0bcd.
>
> Under which encoding?

I am not sure - this was what the unicode chart gives

> > How can I convert the above string into
> > these characters - or at least into base 10 integers?
>
> If you use utf-8 encoding, it goes like this:
>
> # -*- coding: utf-8 -*-
> str1 = u"நான்"
> print repr(str1.encode('utf-8'))
>
> # output:
>
> '\xe0\xae\xa8\xe0\xae\xbe\xe0\xae\xa9\xe0\xaf\x8d'

but this gives 12 groups - I require 4

-- 
regards
Kenneth Gonsalves

_______________________________________________
BangPypers mailing list
BangPypers@python.org
http://mail.python.org/mailman/listinfo/bangpypers
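
A note on the "12 groups versus 4" point: the four values from the chart are Unicode code points, while the 12 values are the UTF-8 bytes for those code points (each character in this range takes 3 bytes). A minimal sketch of getting the four integers directly, assuming Python 3 (in Python 2 the literal would be written u"..." and chr() would be unichr()); the variable names are only illustrative:

```python
# Python 3 sketch. The string is written with escapes so the result does not
# depend on the source-file encoding; it is the same text as நான்.
text = "\u0ba8\u0bbe\u0ba9\u0bcd"

# ord() maps each character to its Unicode code point as a plain integer.
code_points = [ord(ch) for ch in text]

print(code_points)                      # the code points in base 10
print([hex(cp) for cp in code_points])  # the same values in base 16

# The UTF-8 form is a different thing: bytes, three per code point here.
utf8_bytes = text.encode("utf-8")
print(len(code_points), len(utf8_bytes))

# Going the other way: chr() turns a code point back into a character.
rebuilt = "".join(chr(cp) for cp in [0x0BA8, 0x0BBE, 0x0BA9, 0x0BCD])
print(rebuilt == text)
```

So no encoding step is needed to get the four values; encoding only matters when the string has to be written out as bytes.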