Hello,
Alexander Law <exclus...@gmail.com> writes:
Please look at the following l10n bug:
http://www.postgresql.org/message-id/502a26f1.6010...@gmail.com
and the proposed patch.
With your proposed change, the problem will resurface in an actual SQL_ASCII
database. At the problem's root is write_console()'s assumption that messages
are in the database encoding. pg_bind_textdomain_codeset() tries to make that
so, but it only works for encodings with a pg_enc2gettext_tbl entry. That
excludes SQL_ASCII, MULE_INTERNAL, and others. write_console() needs to
behave differently in such cases.
Thank you for the notice. So it seems that the "DatabaseEncoding" variable
alone can't represent both the database encoding (for communication with a
client) and the current process's message encoding (for logging messages)
at once. There should be another variable, something like
CurrentProcessEncoding, that is set to the OS encoding at startup and can
be changed to the encoding of the connected database (if
bind_textdomain_codeset succeeded).
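
To make the idea concrete, here is a rough sketch in Python (for brevity; the real change would be in the backend's C code). The class, its fields and the small GETTEXT_CODESETS table are hypothetical stand-ins, not actual PostgreSQL names or data. It only shows the intended rule: the message encoding starts as the OS encoding and switches to the database encoding only when gettext can be bound to it.

```python
from dataclasses import dataclass, field
from typing import Optional

# Illustrative stand-in for the set of encodings gettext can be bound to.
# SQL_ASCII and MULE_INTERNAL are deliberately absent, as described above.
GETTEXT_CODESETS = {"UTF8": "UTF-8", "LATIN1": "LATIN1", "WIN1251": "CP1251"}


@dataclass
class ProcessEncodings:
    """Tracks the database encoding and the message encoding separately."""

    os_encoding: str
    database_encoding: Optional[str] = None
    current_process_encoding: str = field(init=False)

    def __post_init__(self) -> None:
        # Before any database is connected, messages are in the OS encoding.
        self.current_process_encoding = self.os_encoding

    def connect(self, database_encoding: str) -> bool:
        """Record the database encoding; switch the message encoding only
        if the (hypothetical) gettext binding can succeed."""
        self.database_encoding = database_encoding
        if database_encoding in GETTEXT_CODESETS:
            self.current_process_encoding = database_encoding
            return True
        # Binding failed: messages stay in the OS encoding.
        return False
```

With this split, a console writer would convert from current_process_encoding rather than from the database encoding, so an SQL_ASCII database would leave messages in the OS encoding.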
On Tue, Feb 12, 2013 at 03:22:17AM +0000, Greg Stark wrote:
>But that said I'm not sure saying the whole file is in an encoding is
>the right approach. Paths are actually binary strings. any encoding is
>purely for display purposes anyways.
For Unix, yes. On Windows, they're ultimately UTF16 strings; some system APIs
accept paths in the Windows ANSI code page and convert to UTF16 internally.
Nonetheless, good point.
Yes, and if postgresql.conf is not going to be UTF16 encoded, it seems
natural to use the ANSI code page on Windows to write such paths in it.
So the paths should be written in the OS encoding, which is accepted by OS
functions such as fopen. (This is what we have now.)
And it seems too complicated to have different encodings in one file. Or
maybe path parameters should be separated from the others, for which OS
encoding is undesirable.
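
As a small illustration of treating path settings as OS-encoded byte strings, here is a Python sketch. The two helper names are hypothetical; os.fsdecode and os.fsencode are standard library functions that use the filesystem (OS) encoding. On Unix they use the surrogateescape error handler, so arbitrary path bytes survive the round trip unchanged.

```python
import os


def read_path_setting(raw_value: bytes) -> str:
    """Decode a path copied verbatim from the config file, using the
    OS (filesystem) encoding rather than the database encoding."""
    return os.fsdecode(raw_value)


def path_for_os_call(path: str) -> bytes:
    """Encode the path back to the bytes the OS functions expect."""
    return os.fsencode(path)
```

The point is only that no database-encoding conversion touches the path between the file and the OS call.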
If we knew that postgresql.conf was stored in, say, UTF8, then it would
probably be possible to perform encoding conversion to get string
variables into the database encoding. Perhaps we should allow some
magic syntax to tell us the encoding of a config file?
file_encoding = 'utf8' # must precede any non-ASCII in the file
If we're going to do that we might as well use the Emacs standard
-*-coding: latin-1;-*-
Explicit encoding specifications such as these (or even <?xml
version="1.0" encoding="utf-8"?>) can be useful, but what encoding should
be assumed without one? For XML (without a BOM) it's UTF-8; for Emacs it
depends on its language environment.
If postgresql.conf doesn't have to be portable (as XML is), then IMO the OS
encoding is the right choice for it.
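
A minimal sketch of that rule, again in Python: honour an explicit declaration if one appears before any non-ASCII byte, otherwise fall back to the OS encoding. Both declaration forms are just the ones floated in this thread, and guess_config_encoding is a hypothetical name; none of this is something PostgreSQL actually parses.

```python
import locale
import re

# Declaration forms proposed in this thread (hypothetical syntax).
_DECLARATIONS = (
    re.compile(rb"^\s*file_encoding\s*=\s*'([A-Za-z0-9_-]+)'"),
    re.compile(rb"-\*-\s*coding:\s*([A-Za-z0-9_-]+)\s*;?\s*-\*-"),
)


def guess_config_encoding(raw: bytes) -> str:
    """Return the declared encoding of a config file, or the OS encoding
    when no declaration precedes the first non-ASCII byte."""
    for line in raw.splitlines():
        for pattern in _DECLARATIONS:
            match = pattern.search(line)
            if match:
                return match.group(1).decode("ascii")
        if any(byte > 0x7F for byte in line):
            # Non-ASCII content seen before any declaration: stop looking.
            break
    # No usable declaration: assume the OS (locale) encoding.
    return locale.getpreferredencoding(False)
```

The returned name is not validated here; a real implementation would have to map it to a known encoding and reject anything else.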
Best regards,
Alexander