[HACKERS] Encoding conversions in psql

Mathijs Brands Sun, 11 Jan 2004 12:39:43 -0800

Howdy,

Can anyone explain to me when psql tries to convert between encodings?
It seems to disregard encodings set with SET CLIENT_ENCODING.


The following reproduces the behaviour I'm seeing:

1. create a UNICODE database

2. run the following:
     set client_encoding to latin1;
     create table bla(a text);
     insert into bla values('meëep');

3. try the following from psql:
     Welcome to psql 7.3.4, the PostgreSQL interactive terminal.
     
     Type:  \copyright for distribution terms
            \h for help with SQL commands
            \? for help on internal slash commands
            \g or terminate with semicolon to execute query
            \q to quit
     
     mathijs=# select * from bla;
        a
     -------
      meÃ«ep
     (1 row)
     
     mathijs=# set client_encoding = latin1;
     SET
     mathijs=# select * from bla;
       a
     ------
      meep
     (1 row)
     
     mathijs=# \encoding latin1
     mathijs=# select * from bla;
        a
     -------
      meëep
     (1 row)
 
After setting CLIENT_ENCODING with SET, the ë gets dropped. It looks to
me as if psql still treats the data it gets from the server as UTF8: it
tries to interpret the bytes as UTF8, sees the latin1 ë (a single byte,
which is indeed not a valid UTF8 sequence on its own) and drops it.
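
The three outputs above can be reproduced outside of psql. What follows
is only a sketch using Python's codecs, not psql or libpq code; in
particular, errors="ignore" is my assumption, standing in for whatever
psql actually does when it meets a byte it cannot interpret, and the
variable names are made up for illustration.

```python
# Illustration only: plain Python codecs, not psql or libpq code.
text = "meëep"

# Bytes as stored in a UNICODE database and sent without conversion.
utf8_bytes = text.encode("utf-8")
# Bytes the server sends once client_encoding is latin1.
latin1_bytes = text.encode("latin-1")

# Case 1: UTF8 bytes displayed as if they were latin1.
case1 = utf8_bytes.decode("latin-1")

# Case 2: latin1 bytes interpreted as UTF8, with invalid bytes
# silently discarded (assumed behaviour, see above).
case2 = latin1_bytes.decode("utf-8", errors="ignore")

# Case 3: latin1 bytes interpreted as latin1.
case3 = latin1_bytes.decode("latin-1")

print(case1, case2, case3)
```

This prints "meÃ«ep meep meëep", which matches the three SELECT results
in the session above.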

My question is: why does psql seem to think it's receiving UTF8 data
-after- I've changed the client_encoding? I've checked with a network
sniffer that the results returned are (as expected) the same with or
without using \encoding. Is this behaviour a bug? If not, it is not
very obvious to me; I would expect psql to keep track of the encoding
set between the server and the client.

Cheers,

Mathijs
