J Smith <dark.panda+li...@gmail.com> writes:
> I've attached a patch against master for unaccent.c that uses swscanf
> along with char2wchar and wchar2char instead of sscanf directly to
> initialize the unaccent extension and it appears to fix the problem in
> both the master and 9.1 branches.

swscanf doesn't seem like an acceptable approach: it's a function that
is relied on nowhere else in PG, so it adds new portability risks of its
own. It doesn't exist on some platforms that we support (like the one
I'm typing this message on) and there's no real good reason to assume
that it's not broken in its own ways on others.

If you really want to pursue this, I'd suggest parsing the line
manually, perhaps via strchr searches for \t and \n. It likely wouldn't
be very many more lines than what you've got here.

However, the bigger picture is that OS X's UTF8 locales are broken
through-and-through, and most of their other problems are not feasible
to work around. So basically you can't use them for anything
interesting, and it's not clear that it's worth putting any time into
solving individual problems. In the particular case here, the issue
presumably is that sscanf is relying on isspace() ... but we rely on
isspace() directly, in quite a lot of places, so how much is it going to
fix to dodge it right here?

			regards, tom lane
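
For illustration, the manual parsing suggested above (strchr searches for \t and \n) might look roughly like the following. This is only a sketch under stated assumptions, not code from unaccent.c or from the patch: the function name split_rule_line and its signature are hypothetical, and it assumes each rules line has the form "src<TAB>trg<NEWLINE>". Because it compares only against the ASCII bytes \t, \n and \r, which never occur inside a multibyte UTF-8 sequence, it does not consult sscanf or isspace() at all.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of PostgreSQL): split one rules line of
 * the assumed form
 *     src <TAB> trg <NEWLINE>
 * in place.  Only the bytes '\t', '\n' and '\r' are treated as
 * delimiters, so no locale-dependent function is involved.
 *
 * Returns 1 and sets *src and *trg on success, 0 if the line has no tab
 * or if either field is empty.  The input buffer is modified.
 */
int
split_rule_line(char *line, char **src, char **trg)
{
    char *tab;
    char *end;

    /* Cut the line at the first newline or carriage return, if any. */
    end = line + strcspn(line, "\r\n");
    *end = '\0';

    /* The first tab separates the source string from the target. */
    tab = strchr(line, '\t');
    if (tab == NULL)
        return 0;
    *tab = '\0';

    /* Reject lines where either field is empty. */
    if (line[0] == '\0' || tab[1] == '\0')
        return 0;

    *src = line;
    *trg = tab + 1;
    return 1;
}
```

A caller would pass each buffer it reads from the rules file and skip the line when the function returns 0. Whether lines may also be separated by spaces rather than tabs is not settled above, so a real patch would have to decide which separators to accept.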