Re: [HACKERS] parallel workers and client encoding

Tatsuo Ishii Thu, 09 Jun 2016 16:18:04 -0700

> There appears to be a problem with how client encoding is handled in
> the communication from parallel workers.


Ouch.

>  In a parallel worker, the
> client encoding setting is inherited from its creating process as part
> of the GUC setup.  So any plain-text stuff the parallel worker sends
> to its leader is actually converted to the client encoding.  Since
> most data is sent in binary format, the plain-text provision applies
> mainly to notice and error messages.  At the other end, error messages
> are parsed using pq_parse_errornotice(), which internally uses
> routines that were meant for communication from the client, and
> therefore will convert everything back from the client encoding to the
> server encoding.  So this whole thing actually happens to work as long
> as round tripping is possible between the involved encodings.
> 
> In cases where it isn't, it's still hard to notice the difference
> because depending on whether you get a parallel plan or not, the
> following happens:
> 
> not parallel: conversion error happens between server and client,
> client sees an error message about that
> 
> parallel: conversion error happens between worker and leader, worker
> generates an error message about that, sends it to leader, leader
> forwards it to client
> 
> The client sees the same error message in both cases.
> 
> To construct a case where this makes a difference, the leader has to
> be set up to catch certain errors.  Here is an example:
> 
> """
> create table test1 (a int, b text);
> truncate test1;
> insert into test1 values (1, 'a');
> 
> create or replace function test1() returns text language plpgsql
> as $$
> declare
>   res text;
> begin
>   perform from test1 where a = test2();
>   return res;
> exception when division_by_zero then
>   return 'boom';
> end;
> $$;
> 
> create or replace function test2() returns int language plpgsql
> parallel safe
> as $$
> begin
>   raise division_by_zero using message = 'Motörhead';
>   return 1;
> end
> $$;
> 
> set force_parallel_mode to on;
> 
> select test1();
> """
> 
> With client_encoding = server_encoding, this will return a single row
> 'boom'.  But with, say, database encoding UTF8 and
> PGCLIENTENCODING=KOI8R, it will error:
> 
> ERROR: 22P05: character with byte sequence 0xef 0xbe 0x83 in encoding
> "UTF8" has no equivalent in encoding "KOI8R"
> CONTEXT:  parallel worker
> 
> (Note that changing force_parallel_mode does not force replanning in
> plpgsql, so if you run test1() first before setting
> force_parallel_mode, then you won't get the error.)
> 
> Attached is a patch that illustrates how this could be fixed.  There
> might be similar issues elsewhere.  The notification propagation in
> particular could be affected.

Something like SetClientEncoding(GetDatabaseEncoding()) is a little
bit ugly. It would be nice if we could have a switch to turn off the
automatic encoding conversion in the future, but for 9.6 I'm fine
with your proposed patch.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


