Tom Lane <[EMAIL PROTECTED]> writes:

>> correct utf-8 byte sequence is 0xd18231, so it looks like we call
>> tolower() somewhere on parts of multibyte characters, and it does the
>> same as isspace() - it interprets it's argument as wide character, and
>> converts it.
>
> Indeed, and I am certainly wondering why we should not just say that
> you've got a broken locale definition there. There is absolutely no
> doubt that the ctype.h functions are defined to work on char, not
> wchar.

Agreed, but such corruption indicates that there is non-multibyte-safe
(octet-wise) case conversion somewhere. At best (with a fully working
locale) it will cause case conversion to do nothing instead of performing
the actual conversion.

> They have no business mangling high-bit-set bytes in a multibyte
> encoding.

-- 
WBR, Victor V. Snezhko
E-mail: [EMAIL PROTECTED]
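
To illustrate the distinction discussed above, here is a minimal sketch in C
of the two approaches. It is not taken from the PostgreSQL sources: the
function names lower_bytewise() and lower_multibyte() are hypothetical, and
the second function assumes the caller has already selected a suitable
locale with setlocale().

```c
#include <ctype.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/*
 * Octet-wise lowercasing (hypothetical name). tolower() sees one byte at
 * a time, so it can never recognize a multibyte character. With a correct
 * locale the high-bit-set bytes pass through unchanged and non-ASCII
 * letters are simply not converted; with a broken locale definition
 * individual bytes may be altered, which corrupts the string.
 */
static void lower_bytewise(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char) tolower((unsigned char) *s);
}

/*
 * Multibyte-aware lowercasing (hypothetical name). Each character is
 * decoded with mbrtowc(), converted with towlower(), and re-encoded with
 * wcrtomb(), all according to the current LC_CTYPE locale.
 * Returns 0 on success, -1 on an invalid sequence or if dst is too small.
 */
static int lower_multibyte(const char *src, char *dst, size_t dstlen)
{
    mbstate_t in, out;
    size_t srclen = strlen(src);
    size_t used = 0;

    memset(&in, 0, sizeof in);
    memset(&out, 0, sizeof out);

    while (srclen > 0) {
        wchar_t wc;
        char buf[MB_LEN_MAX];
        size_t n = mbrtowc(&wc, src, srclen, &in);
        size_t m;

        if (n == (size_t) -1 || n == (size_t) -2)
            return -1;          /* invalid or truncated sequence */
        if (n == 0)
            break;              /* defensive: embedded terminator */

        m = wcrtomb(buf, (wchar_t) towlower((wint_t) wc), &out);
        if (m == (size_t) -1 || used + m >= dstlen)
            return -1;          /* unencodable, or no room for it + NUL */

        memcpy(dst + used, buf, m);
        used += m;
        src += n;
        srclen -= n;
    }
    dst[used] = '\0';
    return 0;
}
```

A caller would typically run setlocale(LC_ALL, "") once at startup so that
lower_multibyte() follows the encoding of the environment. In the default
"C" locale both functions only affect the ASCII letters A-Z.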