I am currently writing a Python interface to a C++ library. Some of the functions in this library take Unicode strings (UTF-8, mostly) as arguments.
However, when getting this data I run into a problem on Python 2.2 (RHEL3): while the data is all nice UCS4 in 2.3, in 2.2 it seems to be UTF-8 on top of UCS4. That is, the UTF-8 is encoded in UCS4: three bytes of each UCS4 character are 0 and the first one contains one byte of the string's UTF-8 encoding. Is there a trick to get Python 2.2 to do UCS4 more cleanly?

-- 
Trond Eivind Glomsrød
Senior Software Engineer
Scali - www.scali.com
Scaling the Linux Datacenter
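
To make the symptom described above concrete, here is a minimal sketch that reproduces the "one UTF-8 byte per UCS4 character" layout and undoes it. It is written for a current Python 3 interpreter purely as an illustration, not as a statement of what Python 2.2 does internally. The helper names `widen_bytes` and `repair_widened` are hypothetical, and the sketch assumes the widening happened because a UTF-8 byte string was interpreted as Latin-1.

```python
# Illustration only: assumes the bad data arose from treating UTF-8 bytes
# as Latin-1, so every byte became its own code point (upper bytes all 0).

def widen_bytes(utf8_bytes):
    """Hypothetical helper: mimic the symptom by mapping each byte
    0x00-0xFF to the code point with the same numeric value."""
    return utf8_bytes.decode("latin-1")

def repair_widened(text):
    """Hypothetical helper: narrow each code point back to one byte,
    then decode the resulting byte string as real UTF-8."""
    return text.encode("latin-1").decode("utf-8")

name = "Glomsr\u00f8d"                # 8 characters, one of them non-ASCII
utf8 = name.encode("utf-8")           # the non-ASCII character takes 2 bytes
widened = widen_bytes(utf8)           # one code point per UTF-8 byte

# Every code point fits in a single byte, matching the described layout.
print([hex(ord(c)) for c in widened])
print(repair_widened(widened) == name)
```

If the assumption holds, the same narrow-then-decode round trip could be applied before handing strings to the C++ side; if the data was widened some other way, this repair does not apply.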