> There appears to be a problem with how client encoding is handled in
> the communication from parallel workers.

Ouch.

> In a parallel worker, the client encoding setting is inherited from
> its creating process as part of the GUC setup. So any plain-text
> stuff the parallel worker sends to its leader is actually converted
> to the client encoding. Since most data is sent in binary format,
> the plain-text provision applies mainly to notice and error messages.
> At the other end, error messages are parsed using
> pq_parse_errornotice(), which internally uses routines that were
> meant for communication from the client, and therefore will convert
> everything back from the client encoding to the server encoding. So
> this whole thing actually happens to work as long as round tripping
> is possible between the involved encodings.
>
> In cases where it isn't, it's still hard to notice the difference
> because depending on whether you get a parallel plan or not, the
> following happens:
>
> not parallel: conversion error happens between server and client,
> client sees an error message about that
>
> parallel: conversion error happens between worker and leader, worker
> generates an error message about that, sends it to leader, leader
> forwards it to client
>
> The client sees the same error message in both cases.
>
> To construct a case where this makes a difference, the leader has to
> be set up to catch certain errors.
> Here is an example:
>
> """
> create table test1 (a int, b text);
> truncate test1;
> insert into test1 values (1, 'a');
>
> create or replace function test1() returns text language plpgsql
> as $$
> declare
>   res text;
> begin
>   perform from test1 where a = test2();
>   return res;
> exception when division_by_zero then
>   return 'boom';
> end;
> $$;
>
> create or replace function test2() returns int language plpgsql
> parallel safe
> as $$
> begin
>   raise division_by_zero using message = 'Motörhead';
>   return 1;
> end
> $$;
>
> set force_parallel_mode to on;
>
> select test1();
> """
>
> With client_encoding = server_encoding, this will return a single row
> 'boom'. But with, say, database encoding UTF8 and
> PGCLIENTENCODING=KOI8R, it will error:
>
> ERROR: 22P05: character with byte sequence 0xef 0xbe 0x83 in encoding
> "UTF8" has no equivalent in encoding "KOI8R"
> CONTEXT: parallel worker
>
> (Note that changing force_parallel_mode does not force replanning in
> plpgsql, so if you run test1() first before setting
> force_parallel_mode, then you won't get the error.)
>
> Attached is a patch that illustrates how this could be fixed. There
> might be similar issues elsewhere. The notification propagation in
> particular could be affected.

Something like SetClientEncoding(GetDatabaseEncoding()) is a little bit
ugly. It would be nice if we could have a switch to turn off the
automatic encoding conversion in the future, but for 9.6, I'm fine
with your proposed patch.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp
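
P.S. As a rough illustration of the round-trip condition described in the quoted message, here is a minimal sketch outside PostgreSQL. It uses Python's built-in codecs as a stand-in for the server's conversion routines, so it only shows the idea, not what the backend actually does. The helper name survives_round_trip and the use of LATIN1 as a working client encoding are assumptions made for the illustration.

```python
# Illustration only: Python codecs stand in for PostgreSQL's encoding
# conversion. survives_round_trip is a hypothetical helper name.

def survives_round_trip(text: str, client_encoding: str) -> bool:
    """Return True if text can be converted to client_encoding and back.

    Loosely mirrors the worker-to-leader path from the quoted message:
    the worker converts the message to the client encoding, and the
    leader converts it back to the server encoding.
    """
    try:
        wire_bytes = text.encode(client_encoding)  # "worker" side
    except UnicodeEncodeError:
        return False  # character has no equivalent in the client encoding
    return wire_bytes.decode(client_encoding) == text  # "leader" side


message = "Motörhead"
print(survives_round_trip(message, "latin-1"))  # ö exists in LATIN1
print(survives_round_trip(message, "koi8_r"))   # ö has no KOI8-R equivalent
```

When the round trip fails, the conversion error is raised on the worker side, which is why the leader's exception handler for division_by_zero never gets a chance to see the original error.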