Hi Ross. Thanks a lot for your clarifying. I didn't think my post could be an Unicode frame.
I don't know this mailing list is the right place talking about Unicode issue, but as for me, a million codespace which UTF-16 brings is not enough. It presume that same characters has a same codepoint. But differs from the simple and beauty Roman Alphabet, it is sometimes difficult to decide two kanji characters are "same" or not. Because its glyph swings with various reason(ex. who, when and where it's wrote). So first of all we assign codepoints, and next we consider that "this character which appears in this Chinese historical book may be the same character as this character in Unicode CJK Extension A". Such an identifying characters is also one of my project's tasks. I think this can be explanation why UTF-16 is enough for majority but not for all. Anyway, I suppose that implementing string-like classes is a generic python issue. For example, it will be useful if a rich text class which has style attributes like bold on each characters has also string-like methods and can be dealt with like a string. In article <[EMAIL PROTECTED]>, "Ross Ridge" <[EMAIL PROTECTED]> writes: rridge> thiking about it, it might actually make sense to use strings as the rridge> internal representation as a lot operations can be implemented by using rridge> the standard string operation but multipling the offsets and lengths by rridge> 4. Ah, COOL! It sounds very nice. I'll try it. Thanks again. -- kayama -- http://mail.python.org/mailman/listinfo/python-list