Tom Lane <[EMAIL PROTECTED]> writes:
>> Agreed, but such corruption indicates that there is non-multibyte-safe
>> (octet-wise) case conversion somewhere, at best (with fully working
>> locale) it will cause case conversion to do nothing instead of actual
>> conversion.
>
> Yours is the first insta
Victor Snezhko <[EMAIL PROTECTED]> writes:
> Agreed, but such corruption indicates that there is non-multibyte-safe
> (octet-wise) case conversion somewhere, at best (with fully working
> locale) it will cause case conversion to do nothing instead of actual
> conversion.
Yours is the first install
Tom Lane <[EMAIL PROTECTED]> writes:
>> correct utf-8 byte sequence is 0xd18231, so it looks like we call
>> tolower() somewhere on parts of multibyte characters, and it does the
>> same as isspace() - it interprets its argument as wide character, and
>> converts it.
>
> Indeed, and I am certainl
Victor Snezhko <[EMAIL PROTECTED]> writes:
> correct utf-8 byte sequence is 0xd18231, so it looks like we call
> tolower() somewhere on parts of multibyte characters, and it does the
> same as isspace() - it interprets its argument as wide character, and
> converts it.
Indeed, and I am certainly
Victor Snezhko <[EMAIL PROTECTED]> writes:
> So, we either don't support utf-8 on BSDs
Hmm, calling tolower() on individual octets of a multibyte string is a
bug not only on BSDs but on other platforms as well. On BSDs, though, it
additionally corrupts UTF-8 data.
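
To make the point concrete, here is a minimal sketch of one way to lowercase
a string without touching the octets of multibyte characters. It assumes the
text is UTF-8 and that folding only the ASCII letters is acceptable. The
helper names ascii_lower_n() and ascii_lower() are hypothetical and are not
taken from any existing code base.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: lowercase only 7-bit ASCII letters, in place.
 * Every byte >= 0x80 (a lead or continuation byte of a UTF-8 sequence)
 * is left untouched, so a multibyte character cannot be corrupted.
 * tolower() is deliberately not called, so the result does not depend
 * on the current locale.
 */
void ascii_lower_n(char *s, size_t len)
{
    assert(s != NULL);

    for (size_t i = 0; i < len; i++)
    {
        unsigned char c = (unsigned char) s[i];

        if (c >= 'A' && c <= 'Z')
            s[i] = (char) (c - 'A' + 'a');
    }
}

/* Convenience wrapper for NUL-terminated strings. */
void ascii_lower(char *s)
{
    assert(s != NULL);
    ascii_lower_n(s, strlen(s));
}
```

The trade-off is that non-ASCII letters are not case-folded at all. A
locale-aware alternative would convert the string to wide characters with
mbstowcs(), apply towlower() to each wide character, and convert back with
wcstombs(), but that depends on a working locale for the encoding in use.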
--
WBR, Victor V. Snezhko
E-mail: [EMAI