On 28.03.25 09:13, Peter Eisentraut wrote:
About patch 0003:

I had previously pointed out that the canonicalization might have been intentional, and that we have canonicalization of ICU locale names. But we don't have to keep the setlocale()-based locale checking implementation just for that, I think.  (If this was meant to be a real feature offered by libc, there should be a get_locale_name(locale_t) function.)

Anyway, this patch fails tests on CI on NetBSD, so it will need some further investigation.

It turns out the problem was that nl_langinfo_l() returns a codeset name of "646" for the C locale, which we didn't have in our mapping table. If we add that, then everything passes there as well.  (But the use of nl_langinfo_l() wasn't added by this patch, it just exposes it to the tests.  So this is apparently a pre-existing problem.)

Further analysis:

(But I have not tested any of this.)

It appears that the new implementation of check_locale() provided by this patch, using newlocale() instead of setlocale(), works differently on NetBSD. Specifically, it apparently does not catch garbage locale names. Instead, it just assumes they are C locale. Then, in pg_get_encoding_from_locale(), we have special cases mapping "C" and "POSIX" to SQL_ASCII. But as discussed, with this patch, we no longer do canonicalization of the passed-in locale name, so if you pass a garbage locale name, it will not match "C" or "POSIX". Then, you fall through to the mapping table, and that's where we get the error about the missing "646" entry. That's why this was not a problem before, even though we've used nl_langinfo_l() for a few months and nl_langinfo() before that.

I'm not sure what to do with this. If setlocale() and newlocale() indeed behave differently in what set of locale names they accept, then technically we ought to test both of them, since we do use both of them later on. Or maybe we push on with the effort to get rid of setlocale() calls and then just worry about testing newlocale() (as this patch does). But right now, if newlocale() is more permissive, then we could accept locale names that will later fail setlocale() calls, which might be a problem.




Reply via email to