Re: [HACKERS] [GENERAL] psql weird behaviour with charset encodings

2015-06-17 Thread Michael Paquier
On Thu, Jun 18, 2015 at 9:47 AM, Noah Misch wrote: > On Wed, Jun 03, 2015 at 05:25:45PM +0900, Michael Paquier wrote: >> On Tue, Jun 2, 2015 at 4:19 PM, Michael Paquier >> wrote: >> > On Sun, May 24, 2015 at 2:43 AM, Noah Misch wrote: >> > > It would be good to purge the code of precisions on

Re: [HACKERS] [GENERAL] psql weird behaviour with charset encodings

2015-06-17 Thread Noah Misch
On Wed, Jun 03, 2015 at 05:25:45PM +0900, Michael Paquier wrote: > On Tue, Jun 2, 2015 at 4:19 PM, Michael Paquier > wrote: > > On Sun, May 24, 2015 at 2:43 AM, Noah Misch wrote: > > > It would be good to purge the code of precisions on "s" conversion > > > specifiers, > > > then Assert(!point

Re: [HACKERS] [GENERAL] psql weird behaviour with charset encodings

2015-06-03 Thread Michael Paquier
On Tue, Jun 2, 2015 at 4:19 PM, Michael Paquier wrote: > On Sun, May 24, 2015 at 2:43 AM, Noah Misch wrote: > > It would be good to purge the code of precisions on "s" conversion > specifiers, > > then Assert(!pointflag) in fmtstr() to catch new introductions. I won't > plan > > to do it mysel

Re: [HACKERS] [GENERAL] psql weird behaviour with charset encodings

2015-06-02 Thread Michael Paquier
On Sun, May 24, 2015 at 2:43 AM, Noah Misch wrote: > On Sat, May 08, 2010 at 09:24:45PM -0400, Tom Lane wrote: >> hgonza...@gmail.com writes: >> > http://sources.redhat.com/bugzilla/show_bug.cgi?id=649 >> >> > The last explains why they do not consider it a bug: >> >> > ISO C99 requires for %.*s t

Re: [HACKERS] [GENERAL] psql weird behaviour with charset encodings

2015-05-23 Thread Noah Misch
On Sat, May 08, 2010 at 09:24:45PM -0400, Tom Lane wrote: > hgonza...@gmail.com writes: > > http://sources.redhat.com/bugzilla/show_bug.cgi?id=649 > > > The last explains why they do not consider it a bug: > > > ISO C99 requires for %.*s to only write complete characters that fit below > > the

Re: [HACKERS] [GENERAL] psql weird behaviour with charset encodings

2010-05-08 Thread Tom Lane
hernan gonzalez writes: > BTW, I understand that postgresql uses locale semantics in the server code. > But is this really necessary/appropriate in the client (psql) side? > Couldn't we stick with C locale here? As far as that goes, I think we have to turn on that machinery in order to have gettext

Re: [HACKERS] [GENERAL] psql weird behaviour with charset encodings

2010-05-08 Thread Tom Lane
hgonza...@gmail.com writes: > http://sources.redhat.com/bugzilla/show_bug.cgi?id=649 > The last explains why they do not consider it a bug: > ISO C99 requires for %.*s to only write complete characters that fit below > the > precision number of bytes. If you are using say UTF-8 locale, but ISO-

Re: [HACKERS] [GENERAL] psql weird behaviour with charset encodings

2010-05-08 Thread hgonzalez
Well, I finally found some related (rather old) issues in Bugzilla (glibc) http://sources.redhat.com/bugzilla/show_bug.cgi?id=6530 http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=208308 http://sources.redhat.com/bugzilla/show_bug.cgi?id=649 The last explains why they do not consider it a bug: I

Re: [HACKERS] [GENERAL] psql weird behaviour with charset encodings

2010-05-08 Thread hernan gonzalez
Wow, you are right, this is bizarre... And it's not that glibc intends to compute the length in Unicode chars, it actually counts bytes (plain C chars), as it should, for computing field widths... But, for some strange reason, when there is some width calculation involved it tries to parse the cha

Re: [HACKERS] [GENERAL] psql weird behaviour with charset encodings

2010-05-08 Thread Tom Lane
hernan gonzalez writes: > Sorry about a error in my previous example (mixed width and precision). > But the conclusion is the same - it works on bytes: This example works like that because it's running in C locale always. Try something like this: #include <stdio.h> #include <locale.h> int main () { char s[]

Re: [HACKERS] [GENERAL] psql weird behaviour with charset encodings

2010-05-07 Thread hernan gonzalez
Sorry about an error in my previous example (mixed width and precision). But the conclusion is the same - it works on bytes:

    #include <stdio.h>
    int main () {
        char s[] = "ni\xc3\xb1o";    /* 5 bytes, 4 UTF-8 chars */
        printf("|%*s|\n",6,s);       /* this should pad a blank */
        printf("|%.*s|\n",4,s);

Re: [HACKERS] [GENERAL] psql weird behaviour with charset encodings

2010-05-07 Thread hgonzalez
However, it appears that glibc's printf code interprets the parameter as the number of *characters* to print, and to determine what's a character it assumes the string is in the environment LC_CTYPE's encoding. Well, I have trouble believing that myself :-) This would be nasty... Are you sure?

Re: [HACKERS] [GENERAL] psql weird behaviour with charset encodings

2010-05-07 Thread Tom Lane
hernan gonzalez writes: > The issue is that psql tries (apparently) to convert to UTF8 > (even when it plans to output the raw text, LATIN9 in this case) > just for computing the length of the field, to build the table. > And because for this computation it (apparently) relies on the string > routin