On Tue, Sep 11, 2007 at 12:30:51AM +0900, Tatsuo Ishii wrote:
> Why do you think that employing the Unicode code point as the chr()
> argument could avoid endianness issues? Are you going to represent
> Unicode code point as UCS-4? Then you have to specify the endianness
> anyway. (see the UCS-4 standard for more details)

Because the argument to chr() is an integer, which has no endian-ness.
You only get into endian-ness when you look at how the resulting string
is stored.

> Also I'd like to point out all encodings has its own code point
> systems as far as I know. For example, EUC-JP has its corresponding
> code point systems, ASCII, JIS X 0208 and JIS X 0212. So I don't see
> we can't use "code point" as chr()'s argument for other encodings (of
> course we need optional parameter specifying which character set is
> supposed).

Oh, the last discussion on this didn't answer this question: is there a
standard somewhere that maps integers to characters in EUC-JP? If so,
how can I find out what character 512 is?

Have a nice day,
-- 
Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to
> litigate.
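
P.S. To illustrate the point about the integer argument, here is a
minimal sketch in Python. It is not PostgreSQL: Python's built-in chr()
only stands in for the function under discussion, and the code point
U+3042 is an arbitrary choice for the example. The integer is the same
in every case; byte order shows up only once the character is
serialised, and EUC-JP yields a multi-byte sequence, not a single
integer.

```python
# A code point is a plain integer; it carries no byte order of its own.
code_point = 0x3042            # HIRAGANA LETTER A, picked only for illustration
ch = chr(code_point)           # Python's chr(): Unicode code point -> character

# Byte order appears only when the character is stored in some encoding.
utf32_be = ch.encode("utf-32-be")   # big-endian UTF-32
utf32_le = ch.encode("utf-32-le")   # little-endian UTF-32, same bytes reversed
utf8 = ch.encode("utf-8")           # byte-oriented, no endianness question
euc_jp = ch.encode("euc_jp")        # two bytes derived from JIS X 0208

for name, data in [("UTF-32BE", utf32_be), ("UTF-32LE", utf32_le),
                   ("UTF-8", utf8), ("EUC-JP", euc_jp)]:
    print(f"{name:9} {data.hex(' ')}")
```

The two UTF-32 forms differ only in byte order, while the value passed
to chr() never changes.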