Hi Ross. 

Thanks a lot for the clarification. I didn't think my post could turn
into a Unicode flame.

I don't know whether this mailing list is the right place to talk about
Unicode issues, but for me, the roughly one million code points that
UTF-16 can address are not enough. It presumes that the same character
always has the same code point. But unlike the simple and beautiful
Roman alphabet, it is sometimes difficult to decide whether two kanji
characters are the "same" or not, because a glyph varies for many
reasons (e.g. who wrote it, when, and where). So first of all we
assign code points, and only afterwards do we consider whether "this
character, which appears in this Chinese historical book, may be the
same character as this one in Unicode CJK Extension A". Identifying
characters in this way is also one of my project's tasks. I think this
explains why UTF-16 is enough for the majority but not for everyone.

Anyway, I suppose that implementing string-like classes is a general
Python issue. For example, it would be useful if a rich text class,
which has style attributes such as bold on each character, also had
string-like methods and could be handled like a string.
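
To make the rich text idea concrete, here is a minimal sketch of what I
mean. Everything in it is hypothetical: the class name RichText, the
with_style() and styles() methods, and the choice of one frozenset of
style names per character are only assumptions for illustration, not an
existing library.

```python
class RichText:
    """Hypothetical string-like class: text plus one style set per character."""

    def __init__(self, text="", styles=None):
        self._text = text
        # Default: every character starts with an empty style set.
        if styles is None:
            styles = [frozenset()] * len(text)
        self._styles = list(styles)
        if len(self._styles) != len(self._text):
            raise ValueError("need exactly one style set per character")

    def __str__(self):
        return self._text

    def __len__(self):
        return len(self._text)

    def __getitem__(self, index):
        # Indexing and slicing keep the styles aligned with the characters.
        if isinstance(index, slice):
            return RichText(self._text[index], self._styles[index])
        return RichText(self._text[index], [self._styles[index]])

    def __add__(self, other):
        # A plain str is treated as unstyled text.
        if isinstance(other, str):
            other = RichText(other)
        return RichText(self._text + other._text, self._styles + other._styles)

    def find(self, sub, *args):
        # Searching ignores styles and delegates to the built-in str.find.
        return self._text.find(str(sub), *args)

    def with_style(self, name):
        # Return a copy with the style added to every character.
        return RichText(self._text, [s | {name} for s in self._styles])

    def styles(self):
        return list(self._styles)


greeting = RichText("Hello ").with_style("bold") + "world"
print(str(greeting), len(greeting), greeting.find("world"))
```

Methods that can change the length of the text (upper() on some
characters, for instance) would need extra care to keep the styles
aligned, so I left them out of the sketch.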

In article <[EMAIL PROTECTED]>,
"Ross Ridge" <[EMAIL PROTECTED]> writes:

rridge> thinking about it, it might actually make sense to use strings as the
rridge> internal representation as a lot of operations can be implemented by using
rridge> the standard string operations but multiplying the offsets and lengths by
rridge> 4.

Ah, COOL! It sounds very nice. I'll try it.
Thanks again.
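
Here is a rough sketch of how I understand the suggestion, written
against the bytes type of current Python. The class name WideString and
the fixed big-endian 4-byte layout are my own assumptions. Each
character is stored as 4 bytes, so values beyond the Unicode range fit
too, and the ordinary byte-string operations do the work once offsets
and lengths are multiplied by 4.

```python
class WideString:
    """Hypothetical string-like class storing 4 bytes per code point."""

    WIDTH = 4  # bytes per character

    def __init__(self, codepoints=()):
        # Accept a str or any iterable of integers with 0 <= cp < 2**32.
        if isinstance(codepoints, str):
            codepoints = map(ord, codepoints)
        self._raw = b"".join(cp.to_bytes(self.WIDTH, "big") for cp in codepoints)

    @classmethod
    def _from_raw(cls, raw):
        obj = cls()
        obj._raw = raw
        return obj

    def __len__(self):
        return len(self._raw) // self.WIDTH

    def __getitem__(self, index):
        if isinstance(index, slice):
            start, stop, step = index.indices(len(self))
            if step != 1:
                raise NotImplementedError("this sketch handles contiguous slices only")
            # Character offsets become byte offsets by multiplying by WIDTH.
            return self._from_raw(self._raw[start * self.WIDTH:stop * self.WIDTH])
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError("WideString index out of range")
        offset = index * self.WIDTH
        return int.from_bytes(self._raw[offset:offset + self.WIDTH], "big")

    def __add__(self, other):
        return self._from_raw(self._raw + other._raw)

    def find(self, sub):
        # bytes.find may match in the middle of a character, so skip any
        # hit that is not aligned to a WIDTH-byte boundary.
        pos = self._raw.find(sub._raw)
        while pos != -1 and pos % self.WIDTH:
            pos = self._raw.find(sub._raw, pos + 1)
        return pos if pos == -1 else pos // self.WIDTH


s = WideString("ab") + WideString([0x110000, ord("c")])
print(len(s), hex(s[2]), s.find(WideString([0x110000])))
```

The one thing to watch out for seems to be alignment: a byte-level
search can find a match that straddles two characters, so find() has to
reject offsets that are not multiples of 4.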

-- kayama
-- 
http://mail.python.org/mailman/listinfo/python-list
