On Aug 19, 2010, at 3:24 PM, Tom Lane wrote:

> Steven Schlansker <ste...@trumpet.io> writes:
>> I'm not at all experienced with character encodings so I could
>> be totally off base, but isn't it wrong to ever call isspace(0x85),
>> whatever the result may be, given that the actual character is 0xCF85?
>> (U+03C5, GREEK SMALL LETTER UPSILON)
>
> We generally assume that in server-safe encodings, the ctype.h functions
> will behave sanely on any single-byte value.  You can argue the wisdom
> of that, but deciding to change that policy would be a rather massive
> code change; I'm not excited about going that direction.
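(For anyone skimming the archives, here's the scenario in miniature: a byte-at-a-time scan handing 0x85 - the trailing byte of the UTF-8 upsilon, 0xCF 0x85 - to isspace(). This is purely an illustrative sketch, not PostgreSQL's actual parsing code.)

#include <ctype.h>
#include <locale.h>
#include <stdio.h>

int main(void)
{
    /* "foo <upsilon> bar": the upsilon is the two UTF-8 bytes 0xCF 0x85 */
    const char *s = "foo \xCF\x85 bar";

    setlocale(LC_CTYPE, "");    /* without this we silently stay in the C locale */

    for (const char *p = s; *p != '\0'; p++)
    {
        unsigned char c = (unsigned char) *p;

        /* a naive scanner asks "is this byte whitespace?" for every byte,
         * including the 0x85 continuation byte of the upsilon */
        printf("byte 0x%02X  isspace=%d\n", c, isspace(c) != 0);
    }
    return 0;
}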
Fair enough.  I presume there are no "server-safe encodings" in which a
multibyte sequence 0xXX20 would be valid - that would break anyway, since
the second byte looks like a real space.

> You need a setlocale() call, else the program acts as though it's in C
> locale regardless of environment.

Sigh.  I hate C sometimes. :-p

Anyway, it looks like this is actually a BSD bug that got copied and pasted
into Apple's Darwin source:
http://lists.freebsd.org/pipermail/freebsd-i18n/2007-September/000157.html

I have a couple of contacts at Apple, so I'll see if there's any interest in
backporting a fix, but I wouldn't hope for it to happen quickly, if at all...

Thanks for taking a look into fixing this - I hope you guys can reach
consensus on how to get it fixed :)

Best,
Steven Schlansker
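P.S. In case anyone wants to reproduce this locally, here's a minimal sketch
(with the setlocale() call in place this time). The locale name "en_US.UTF-8"
is just an assumption - substitute whatever UTF-8 locale your system actually
has:

#include <ctype.h>
#include <locale.h>
#include <stdio.h>

int main(void)
{
    /* baseline: the C locale should never call 0x85 whitespace */
    setlocale(LC_CTYPE, "C");
    printf("C locale:           isspace(0x85) = %d\n", isspace(0x85) != 0);

    /* now a UTF-8 locale; on the affected BSD/Darwin libcs this prints 1,
     * apparently because the byte gets classified as if it were U+0085 (NEL) */
    if (setlocale(LC_CTYPE, "en_US.UTF-8") == NULL)
    {
        fprintf(stderr, "en_US.UTF-8 not available on this system\n");
        return 1;
    }
    printf("en_US.UTF-8 locale: isspace(0x85) = %d\n", isspace(0x85) != 0);
    return 0;
}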