Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-10 Thread Heikki Linnakangas
On 04.06.2013 09:39, Martin Schäfer wrote: Can't really blame Windows on that. On Windows, we don't require that the encoding and LC_CTYPE's charset match. The OP used UTF-8 encoding in the server, but LC_CTYPE="English_United Kingdom.1252", ie. LC_CTYPE implies WIN1252 encoding. We allow that an

Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-10 Thread Martin Schäfer
> On 06/03/2013 02:41 PM, Andrew Dunstan wrote: > > > > On 06/03/2013 02:28 PM, Tom Lane wrote: > >> . I wonder though if we couldn't just fix this code to not do > >> anything to high-bit-set bytes in

Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-08 Thread Andrew Dunstan
On 06/03/2013 02:41 PM, Andrew Dunstan wrote: On 06/03/2013 02:28 PM, Tom Lane wrote: . I wonder though if we couldn't just fix this code to not do anything to high-bit-set bytes in multibyte encodings. That's exactly what I suggested back in November. This thread seems to have gone cold

Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-03 Thread Martin Schäfer
> Can't really blame Windows on that. On Windows, we don't require that the > encoding and LC_CTYPE's charset match. The OP used UTF-8 encoding in the > server, but LC_CTYPE="English_United Kingdom.1252", ie. LC_CTYPE implies > WIN1252 encoding. We allow that and it generally works on Windows > bec

Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-03 Thread Heikki Linnakangas
On 03.06.2013 21:28, Tom Lane wrote: Heikki Linnakangas writes: He *is* using UTF-8. Or trying to, anyway :-). The downcasing in the backend is supposed to leave bytes with the high-bit set alone, ie. in UTF-8 encoding, it's supposed to leave ä and ß alone. Well, actually, downcase_truncate

Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-03 Thread Andrew Dunstan
On 06/03/2013 02:28 PM, Tom Lane wrote: . I wonder though if we couldn't just fix this code to not do anything to high-bit-set bytes in multibyte encodings. That's exactly what I suggested back in November. cheers andrew
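The suggestion quoted above (leave high-bit-set bytes untouched when the encoding is multibyte) can be sketched roughly as follows. This is only an illustration of the idea, not PostgreSQL's actual downcase_truncate_identifier(); the function name downcase_ident_sketch and the encoding_is_multibyte flag are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical sketch, not PostgreSQL source code.
 * Downcases an identifier in place:
 *   - ASCII 'A'..'Z' are always folded to 'a'..'z';
 *   - bytes with the high bit set are passed to the locale's tolower()
 *     only when the encoding is single-byte;
 *   - in a multibyte encoding such as UTF-8 they are left alone, because
 *     each such byte is only a fragment of a character.
 */
void downcase_ident_sketch(char *ident, size_t len, int encoding_is_multibyte)
{
    size_t i;

    assert(ident != NULL);
    for (i = 0; i < len; i++) {
        unsigned char ch = (unsigned char) ident[i];

        if (ch >= 'A' && ch <= 'Z')
            ch = (unsigned char) (ch + ('a' - 'A'));
        else if (ch >= 0x80 && !encoding_is_multibyte && isupper(ch))
            ch = (unsigned char) tolower(ch);
        ident[i] = (char) ch;
    }
}
```

With encoding_is_multibyte set, the UTF-8 bytes of a name such as ID_Äß pass through unchanged and only the ASCII prefix is folded. The non-ASCII letters are then not case-folded at all, which is the trade-off the suggestion accepts.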

Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-03 Thread Tom Lane
Heikki Linnakangas writes: > He *is* using UTF-8. Or trying to, anyway :-). The downcasing in the > backend is supposed to leave bytes with the high-bit set alone, ie. in > UTF-8 encoding, it's supposed to leave ä and ß alone. Well, actually, downcase_truncate_identifier() is doing this:

Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-03 Thread Andrew Dunstan
On 06/03/2013 12:22 PM, Heikki Linnakangas wrote: On 03.06.2013 18:27, k...@rice.edu wrote: On Mon, Jun 03, 2013 at 04:09:29PM +0100, Martin Schäfer wrote: If I change the strCreate query and add double quotes around the column name, then the problem disappears. But the original name is alr

Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-03 Thread Heikki Linnakangas
On 03.06.2013 18:27, k...@rice.edu wrote: On Mon, Jun 03, 2013 at 04:09:29PM +0100, Martin Schäfer wrote: If I change the strCreate query and add double quotes around the column name, then the problem disappears. But the original name is already in lowercase, so I think it should also work wi

Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-03 Thread k...@rice.edu
On Mon, Jun 03, 2013 at 04:09:29PM +0100, Martin Schäfer wrote: > > > > If I change the strCreate query and add double quotes around the column > > name, then the problem disappears. But the original name is already in > > lowercase, so I think it should also work without quoting the column name.
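Quoting works around the problem because PostgreSQL does not case-fold quoted identifiers, so the bytes of the name are never passed through the downcasing step. A minimal sketch of such quoting, assuming the name is already valid UTF-8, might look like this. The helper quote_ident_sketch is hypothetical; with a live connection, libpq's PQescapeIdentifier() is the function provided for this purpose.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of libpq.
 * Returns a newly malloc'd copy of ident wrapped in double quotes, with
 * any embedded double quote doubled. The caller must free() the result.
 * Scanning byte by byte is safe for UTF-8 input because the byte 0x22
 * never occurs inside a UTF-8 multibyte sequence.
 */
char *quote_ident_sketch(const char *ident)
{
    size_t len, i, j = 0;
    char *out;

    assert(ident != NULL);
    len = strlen(ident);
    /* Worst case: every byte doubled, plus two quotes and the NUL. */
    out = malloc(2 * len + 3);
    if (out == NULL)
        return NULL;

    out[j++] = '"';
    for (i = 0; i < len; i++) {
        if (ident[i] == '"')
            out[j++] = '"';
        out[j++] = ident[i];
    }
    out[j++] = '"';
    out[j] = '\0';
    return out;
}
```

The quoted result can then be spliced into the CREATE TABLE statement in place of the bare column name.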

Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-03 Thread Martin Schäfer
> -Original Message- > From: k...@rice.edu [mailto:k...@rice.edu] > Sent: 03 June 2013 16:48 > To: Martin Schäfer > Cc: pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] UTF-8 encoding problem w/ libpq > > On Mon, Jun 03, 2013 at 03:40:14PM +0100, Martin Sch

Re: [HACKERS] UTF-8 encoding problem w/ libpq

2013-06-03 Thread k...@rice.edu
On Mon, Jun 03, 2013 at 03:40:14PM +0100, Martin Schäfer wrote: > I try to create database columns with umlauts, using the UTF8 client > encoding. However, the server seems to mess up the column names. In > particular, it seems to perform a lowercase operation on each byte of the > UTF-8 multi-b

[HACKERS] UTF-8 encoding problem w/ libpq

2013-06-03 Thread Martin Schäfer
I try to create database columns with umlauts, using the UTF8 client encoding. However, the server seems to mess up the column names. In particular, it seems to perform a lowercase operation on each byte of the UTF-8 multi-byte sequence. Here is my code: const wchar_t *strName = L"id_äß";
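To illustrate the kind of corruption described here, the sketch below lowercases a buffer one byte at a time using the WIN1252 case mapping, ignoring UTF-8 character boundaries. It is an assumption that the server-side folding behaved exactly like this. Both function names are hypothetical, and the mapping is hard-coded so that the example does not depend on the locale of the machine it runs on.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: lowercase a single byte the way a WIN1252
 * single-byte locale would. Not a PostgreSQL or libpq function.
 */
unsigned char win1252_tolower_sketch(unsigned char c)
{
    if (c >= 'A' && c <= 'Z')
        return (unsigned char) (c + ('a' - 'A'));
    if (c >= 0xC0 && c <= 0xDE && c != 0xD7)   /* À..Þ, skipping × */
        return (unsigned char) (c + 0x20);
    switch (c) {
    case 0x8A: return 0x9A;   /* Š -> š */
    case 0x8C: return 0x9C;   /* Œ -> œ */
    case 0x8E: return 0x9E;   /* Ž -> ž */
    case 0x9F: return 0xFF;   /* Ÿ -> ÿ */
    default:   return c;
    }
}

/* Lowercase a buffer byte by byte, with no regard for multibyte sequences. */
void bytewise_downcase_sketch(unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < len; i++)
        buf[i] = win1252_tolower_sketch(buf[i]);
}
```

In UTF-8, id_äß is the byte sequence 69 64 5F C3 A4 C3 9F. Under this mapping the lead byte C3 (Ã in WIN1252) becomes E3 and the trail byte 9F (Ÿ in WIN1252) becomes FF, so the result is 69 64 5F E3 A4 E3 FF, which is no longer valid UTF-8.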