On Wed, Jan 30, 2013 at 10:00:01AM +0400, Alexander Law wrote:
> 30.01.2013 05:51, Noah Misch wrote:
>> On Tue, Jan 29, 2013 at 09:54:04AM -0500, Tom Lane wrote:
>>> Alexander Law <exclus...@gmail.com> writes:
>>>> Please look at the following l10n bug:
>>>> http://www.postgresql.org/message-id/502a26f1.6010...@gmail.com
>>>> and the proposed patch.
>> Even then, I wouldn't be surprised to find problematic consequences beyond
>> error display. What if all the databases are EUC_JP, the platform encoding
>> is KOI8, and some postgresql.conf settings contain EUC_JP characters? Does
>> the postmaster not rely on its use of SQL_ASCII to allow those values?
>>
>> I would look at fixing this by making the error output machinery smarter in
>> this area before changing the postmaster's notion of server_encoding.

With your proposed change, the problem will resurface in an actual SQL_ASCII
database. At the problem's root is write_console()'s assumption that messages
are in the database encoding. pg_bind_textdomain_codeset() tries to make that
so, but it only works for encodings with a pg_enc2gettext_tbl entry. That
excludes SQL_ASCII, MULE_INTERNAL, and others. write_console() needs to
behave differently in such cases.

> Maybe I still miss something but I thought that
> postinit.c/CheckMyDatabase will switch encoding of a messages by
> pg_bind_textdomain_codeset to EUC_JP so there will be no issues with it.
> But until then KOI8 should be used.
> Regarding postgresql.conf, as it has no explicit encoding specification,
> it should be interpreted as having the platform encoding. So in your
> example it should contain KOI8, not EUC_JP characters.

Following some actual testing, I see that we treat postgresql.conf values as
byte sequences; any reinterpretation as encoded text happens later. Hence,
contrary to my earlier suspicion, your patch does not make that situation
worse. The present situation is bad; among other things, current_setting() is
a vector for injecting invalid text data. But unconditionally validating
postgresql.conf values in the platform encoding would not be an improvement.
Suppose you have a UTF-8 platform encoding and KOI8R databases. You may wish
to put KOI8R strings in a GUC, say search_path.
That's possible today; if we required that postgresql.conf conform to the
platform encoding and no other, it would become impossible. This area
warrants improvement, but doing so will entail careful design.

Thanks,
nm

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
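
To illustrate the point about postgresql.conf values being byte sequences,
here is a minimal Python sketch. It is not PostgreSQL code: the helper
validate_in_encoding and the sample schema name are hypothetical, chosen only
for the example. It assumes a setting value stored as KOI8-R bytes and checks
those bytes against both a UTF-8 platform encoding and the KOI8-R database
encoding.

```python
# Illustration only; nothing here is taken from the PostgreSQL sources.

def validate_in_encoding(raw: bytes, encoding: str) -> bool:
    """Return True if the byte string decodes cleanly in the given encoding."""
    try:
        raw.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Hypothetical search_path entry: a Cyrillic schema name stored in the
# configuration file as KOI8-R bytes.
schema_name = "схема"
conf_value = schema_name.encode("koi8_r")

# Checked against a UTF-8 platform encoding, the bytes are rejected ...
platform_ok = validate_in_encoding(conf_value, "utf-8")

# ... although they are perfectly valid text in the database encoding.
database_ok = validate_in_encoding(conf_value, "koi8_r")

print("valid as UTF-8: ", platform_ok)
print("valid as KOI8-R:", database_ok)
```

Under these assumptions, a rule that validated every value strictly in the
platform encoding would refuse this setting, even though a KOI8R database
could use it as is.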