> The Unicode standard doesn't require that you support surrogates, or
> any other kind of character, so no you wouldn't be lying.

There is the notion of Unicode implementation levels, and each of them does include a set of characters to support. In level 1, combining characters need not be supported (which is sufficient for scripts that can be represented without combining characters, such as Latin and Cyrillic, using precomposed characters if necessary). In level 2, combining characters must be supported for some scripts that absolutely need them, and in level 3, all characters must be supported.

What "supported" means is probably a matter of interpretation. Python clearly supports Unicode level 1 (if we leave aside the issue that it can't render all these characters out of the box, as it doesn't ship any fonts); it could be argued that it implements level 3, as it is capable of representing all Unicode characters (but, of course, so is Python 1.5.2, if you put UTF-8 into byte strings).

Regards,
Martin
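
P.S. To make the precomposed/combining distinction concrete, here is a minimal sketch. It assumes a current Python 3 interpreter and the standard unicodedata module; the helper name has_combining and the sample characters are only illustrative, and the check says nothing about which implementation level an application actually conforms to.

```python
import unicodedata

# Precomposed form: one code point, U+00E9 LATIN SMALL LETTER E WITH ACUTE.
precomposed = "\u00e9"
# Decomposed form: base letter "e" followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = "e\u0301"


def has_combining(text):
    """Return True if any code point in text has a nonzero canonical combining class.

    Hypothetical helper for illustration only.
    """
    return any(unicodedata.combining(ch) != 0 for ch in text)


# NFC composes to the precomposed character where one exists; NFD decomposes.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", precomposed)

print(has_combining(precomposed))  # False
print(has_combining(decomposed))   # True
print(nfc == precomposed, nfd == decomposed)  # True True

# A character outside the Basic Multilingual Plane survives a round trip
# through UTF-8 bytes, which is the "put UTF-8 into byte strings" approach.
supplementary = "\U00010400"
encoded = supplementary.encode("utf-8")
print(len(encoded))  # 4
```

Text that stays in the precomposed form throughout, as in the first case, never needs a combining character, which is the situation level 1 is described as covering above.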